The Product Team is at the core of designing and building the future of data collection, management and analysis. We leverage and develop the most modern open source and proprietary technologies in Machine Learning, Distributed Computing and Robotic Process Automation.
Our team is building a suite of machine learning tools to help solve problems in the scientific space. This includes linking researchers to their publications, sentiment analysis of citations, record linking, predicting the ROI of a grant, understanding the landscape of academic research and much more. We are looking for data scientists who are interested not only in plugging data into a model, but also in taking a deep dive into the academic research world.
- Analyze requirements and formulate an appropriate technical solution that meets functional and non-functional requirements.
- Experience with large datasets in the hundreds of GB
- Broad, fundamental understanding of data mining and predictive analytics techniques
- 2+ years of data science experience and deep knowledge of various modeling techniques
- Python (pandas, scikit-learn, NumPy, Matplotlib), SQL, Git
- Strong communication skills, both verbal and written, are a must
- Familiarity with Agile methodologies and tools (Jira)
- Strong software engineering skills
- Experience with Spark or Hadoop