The Product Team is at the core of designing and building the future of data collection, management and analysis. We leverage and develop the most modern open source and proprietary technologies in Machine Learning, Distributed Computing and Robotic Process Automation.
Our team is building a suite of machine learning tools to help solve problems in the life science space. This includes linking researchers and physicians to their research assets (publications, patents, books, clinical trials, conferences, etc.), predicting the philanthropic activity of donors to not-for-profit foundations, and much more. We are looking for data scientists who are interested not only in plugging data into a model, but also in taking a deep dive into the academic research world.
- Analyze requirements and formulate an appropriate technical solution that meets functional and non-functional requirements.
- Experience with large datasets (hundreds of gigabytes)
- Broad, fundamental understanding of data mining and predictive analytics techniques
- 3+ years of data science experience and deep knowledge of various modeling techniques
- Python (pandas, scikit-learn, NumPy, Matplotlib), SQL, Git
- Strong verbal and written communication skills are a must
- Familiarity with Agile methodologies and tools (Jira)
- Strong software engineering skills
- Experience with Spark or Hadoop