Build, improve and extend NLP capabilities
Research and evaluate new/different approaches to NLP problems
Produce deliverable results and take them from development to production in collaboration with our AI Scientists and Machine Learning Engineers.
Potential for managerial responsibilities within 6 to 18 months.
Degree in Computer Science, Computational Linguistics or a related field from a top-tier university
3+ years of experience, with the ability to work in depth on developing the NLP capabilities described below, plus managerial potential or some managerial experience
Must have: Strong understanding of text pre-processing and normalization techniques such as tokenization, POS tagging and parsing, and of how they work at a low level
Must have: Experience working on millions of text documents
Expertise in at least 3 of the following: Entity Extraction, Relationship Extraction, Document Classification, Topic Modeling, Natural Language Understanding (NLU)
Experience with some of the open-source NLP toolkits such as CoreNLP, OpenNLP, spaCy, NLTK, Gensim, LingPipe, Mallet, etc.
Experience with open-source ML/math toolkits such as Scikit-learn, MLlib, Theano, NumPy, etc.
Experience with noisy and/or unstructured textual data (e.g. tweets)
Strong knowledge of Python and general software development skills (source code management, debugging, testing, deployment, etc.)
Expertise in producing, processing, evaluating and utilizing training data.
Strong interest in, and knowledge of, Artificial Intelligence and its subfields
Experience with non-English NLP
Experience with Deep Learning and Word Embeddings
Ability to collaborate with larger teams, and excellent communication skills
We believe that the nature of analytical work is undergoing a radical transformation. The explosion in digital information has ushered in an era of unprecedented complexity. Humans are limited in their ability to process complexity, and the consequence is biased decisions.
At Accrete, we specialize in overcoming sparse training data challenges to build dynamic, continuously learning models. We leverage human experts to create semantically rich training data that amounts to less than 0.1% of the overall data ingested by our learning models. These models underlie a platform with a variety of core capabilities including intelligent web crawling, contextual analytics and semantic search.