Feature Analysis for Predicate Argument Identification using Random Forests

A variety of features are used to identify ARG1s in predicate-argument relationships, including N-gram, predicate, path, and embedding features. It is unclear which of these matter most when a machine learning model identifies ARG1s.

This paper uses feature importance and permutation importance in a binary-classification random forest to assess a range of features and determine their impact. The most important features in the random forest were the distance from the word to the predicate, the word-to-predicate embedding distance, and the word itself.
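As a rough illustration of what permutation importance measures, the sketch below shuffles one feature column at a time and records the mean drop in accuracy. It is not the paper's implementation: the feature names (`distance_to_predicate`, `noise`), the toy data, and the hand-written `predict` rule standing in for a trained random forest are all assumptions made for the example.

```python
import random
from statistics import mean


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the gold label."""
    return mean(1.0 if predict(r) == y else 0.0 for r, y in zip(rows, labels))


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            # Shuffle only this feature's values across rows; this breaks
            # its link to the labels while leaving other features intact.
            column = [r[name] for r in rows]
            rng.shuffle(column)
            shuffled = [{**r, name: v} for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances[name] = mean(drops)
    return importances


# Hypothetical toy data: label 1 marks a word that is an ARG1.
rows = [
    {"distance_to_predicate": d, "noise": n}
    for d, n in [(1, 3), (2, 7), (1, 5), (2, 1), (5, 4), (6, 8), (7, 2), (8, 6)]
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def predict(row):
    # Stand-in for a trained classifier (assumption): words close to the
    # predicate are labeled ARG1. The "noise" feature is ignored entirely.
    return 1 if row["distance_to_predicate"] <= 2 else 0


scores = permutation_importance(predict, rows, labels)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

A feature the model never consults (here `noise`) scores exactly zero, while a feature the predictions depend on shows a positive drop. With a fitted scikit-learn model, `sklearn.inspection.permutation_importance` performs the same kind of measurement.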

Final paper for the NYU Natural Language Processing course taught by Adam Meyers.

Jeremy Lu
