Extract content: pull text, images, HTML, videos and social media content from web pages and documents
Extract entities: identify keywords, people, organizations and products
Topic classification: understand the topics of discussion in a document
Page type classification: determine the document type, such as article, listicle, PDF, long-form piece, terms and conditions page or media gallery, among many others
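
To make the first capability in the list above concrete, here is a minimal sketch of content extraction from an HTML page. It is an illustration only and not the product's API: the `ContentExtractor` class and `extract_content` function are hypothetical names, and the sketch relies solely on Python's standard `html.parser` module. It assumes that "content" means visible text, image URLs and video URLs, and it only picks up videos declared as `<video src="...">`.

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, image URLs and video URLs."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.images = []
        self.videos = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "video" and attrs.get("src"):
            # Assumption: only <video src="..."> is handled, not nested <source>.
            self.videos.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script> and <style>.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract_content(html):
    """Return the text, image URLs and video URLs found in an HTML string."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return {
        "text": " ".join(parser.text_parts),
        "images": parser.images,
        "videos": parser.videos,
    }


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{}</style></head><body>"
        "<h1>Title</h1><p>Hello <b>world</b></p>"
        '<img src="a.png"><video src="v.mp4"></video>'
        "<script>var x=1;</script></body></html>"
    )
    print(extract_content(sample))
```

Entity extraction, topic classification and page type classification would typically operate on the `text` field returned here; those steps need trained models and are outside the scope of this sketch.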