
This Machine Learning Glossary aims to briefly introduce the most important Machine Learning terms – both for the commercially and…

Getting started with AI? Perhaps you’ve already got your feet wet in the world of Machine Learning, but are still looking to expand your knowledge and catch up on the subjects you’ve heard of but never quite had time to study?

1. NLP – Natural Language Processing

Natural Language Processing (NLP) is an umbrella term for a variety of Machine Learning methods that enable a computer to understand and operate on human (i.e. natural) language as it is spoken or written.

The most important use cases of Natural Language Processing are:

In text classification and ranking, the goal is to predict the class (label) of a document, or to rank the documents in a list by their relevance. It can be used in spam filtering (predicting whether an e-mail is spam or not) or content classification (selecting articles from the web about what is happening to your competitors).
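To make the spam-filtering case concrete, here is a minimal sketch of a word-count (Naive Bayes) classifier. Naive Bayes is our choice for illustration, not a method the glossary prescribes, and the function names and the toy e-mails are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of documents
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, invented for illustration only.
TOY_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

model = train_naive_bayes(TOY_EMAILS)
```

Calling `predict("free prize money", model)` scores the text against each label and returns the more likely one. A real spam filter would need far more data and better tokenization than a whitespace split.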

2. Reinforcement learning

Reinforcement Learning (RL) differs from the approaches we’ve described earlier. In RL, the algorithm plays a “game” in which it aims to maximize a reward. It tries different actions (“moves”) by trial and error and learns which ones yield the highest reward.
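The trial-and-error loop can be sketched with one of the simplest RL settings, a multi-armed bandit played with an epsilon-greedy strategy. This is an illustrative assumption rather than the only way to do RL; the function name `run_bandit`, the reward probabilities, and the parameter defaults are all hypothetical.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a bandit "game": each move (arm) pays 1 with its own probability."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    pulls = [0] * n_arms        # how many times each move was tried
    estimates = [0.0] * n_arms  # running average reward per move
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random move.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the move with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1 if rng.random() < reward_probs[arm] else 0
        pulls[arm] += 1
        # Incremental update of the average reward for this move.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward
    return estimates, pulls, total_reward


estimates, pulls, total_reward = run_bandit([0.2, 0.5, 0.8])
```

With enough steps, the move with the highest payout probability should end up being played most often, because its estimated reward overtakes the others once it has been explored a few times.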

3. Dataset

All the data that is used for either building or testing an ML model is called a dataset. Typically, data scientists divide their datasets into three separate groups:

- Training data is used to train a model. This means that the ML model sees this data and learns to detect patterns or determine which features are most important for prediction.

- Validation data is used for tuning model parameters and comparing different models in order to determine the best one. The validation data should be different from the training data, and should not be used in the training phase. Otherwise, overfitting would go unnoticed, and the chosen model could generalize poorly to new (production) data.

- Test data is the third and final set (often called a hold-out). Keeping it aside may seem tedious, but it is used once the final model has been chosen, to simulate the model’s behaviour on completely unseen data, i.e. data points that weren’t used in building models or even in deciding which model to choose.
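The three groups above can be produced with a simple shuffled split. The sketch below is one possible way to do it; the function name `split_dataset` and the 70/15/15 proportions are assumptions chosen for illustration, not values the glossary recommends.

```python
import random


def split_dataset(samples, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle the samples and split them into train, validation and test lists."""
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                # held out until the final model is chosen
    val = shuffled[n_test:n_test + n_val]   # used for tuning and model comparison
    train = shuffled[n_test + n_val:]       # used to fit the model
    return train, val, test


train, val, test = split_dataset(range(100))
```

Because every sample lands in exactly one list, no data point used for training can leak into validation or testing. For time-ordered data, a chronological split is usually a better fit than random shuffling.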
