Let’s assume your company is trying to answer the following questions:
- Which future opportunities are most likely to successfully close?
- What are the most important factors driving sales growth?
- In which stores should I run my new marketing campaign? For what client groups?
- What is the probability that a particular customer will renew his contract?
- Which clients are most likely to end their relationship with us over the next quarter?
To answer such questions, it’s necessary to analyze historical data (usually derived from users’ interaction with different software apps) and come up with a projection. This is exactly what Data Scientists, who hold one of the most popular and well-paid IT jobs, offer.
The Data Scientist analyzes historical data and builds a model that predicts future outcomes. Most difficulties in defining the qualifications and responsibilities of a Data Scientist stem from the following challenges:
- Finding one person who possesses the necessary mix of skills and training is no walk in the park. The role requires statistical and mathematical knowledge (sometimes advanced); programming skills (often R, Python, or even Java / Scala); machine learning knowledge (clustering, k-NN, Naive Bayes, SVM, and decision trees); and advanced data handling skills, e.g., querying, processing, and visualization (SQL, analytics, and tools such as D3.js).
- The other complication is related to Big Data, which is widely known to come at us in great volume and variety, at great speed. The term Big Data refers to a very large amount of data (terabytes or petabytes) whose storage and processing require systems that automate and parallelize workloads. Big Data often comes from the Internet, sensors, logs, etc. The Data Scientist analyzes these huge data sets to find correlations and patterns that can be leveraged for decision support. So Big Data is about storing and processing data, while Data Science is about understanding that data.
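To make the machine learning part of that skill set more concrete, here is a minimal sketch of k-NN, one of the techniques listed above, applied to the contract renewal question from the start of the article. The data set, the two features, and the function name `renewal_probability` are all made up for illustration; a real project would use far more data, scaled features, and a tested library rather than hand-written code.

```python
import math

# Hypothetical historical records, invented for this example:
# (months_as_customer, support_tickets_last_year, renewed_contract)
HISTORY = [
    (36, 1, True),
    (30, 0, True),
    (24, 2, True),
    (18, 1, True),
    (12, 4, False),
    (6, 5, False),
    (8, 3, False),
    (3, 6, False),
]


def renewal_probability(months, tickets, k=3):
    """Estimate renewal probability as the share of the k most similar
    past customers (by Euclidean distance) who renewed.

    Note: features are used unscaled here to keep the sketch short.
    """
    by_distance = sorted(
        HISTORY,
        key=lambda row: math.dist((months, tickets), (row[0], row[1])),
    )
    nearest = by_distance[:k]
    return sum(1 for row in nearest if row[2]) / k


if __name__ == "__main__":
    # A long-standing customer with few tickets vs. a new one with many.
    print(renewal_probability(33, 1))
    print(renewal_probability(5, 5))
```

The idea is simply that a customer is expected to behave like the past customers who look most similar. Choosing which features to compare, and how many neighbors to consult, is where the statistical judgment described above comes in.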
The chart below illustrates just how hot this job currently is in the U.S. market. Salaries vary quite a bit, 1) depending on experience, and 2) because some companies say they offer data science jobs but are actually looking to fill a different sort of role. For example, a person who simply inserts information into a database cannot be considered a Data Scientist. Hiring a proper Data Scientist is a difficult task precisely because of the necessary training and experience.
In Romania, such jobs are also relatively rare. Very few companies can claim that they are doing Data Science in the true sense of the word. Often, the Data Scientist profile in Romania describes a Java/Scala or database (SQL) developer who invests time in building up statistical knowledge, for example. However, the local market is quickly picking up. Companies like Optymyze are looking for candidates for such complex jobs.
Not all is perfect in the world of a Data Scientist. He or she will spend a lot of time (sometimes up to 90%) processing and cleaning data sets to eliminate anomalies, errors, and inconsistencies. The rewards, however, are significant, and not only from a financial perspective. The Data Scientist plays a major role in shaping strategic trends and decisions in vital activities such as fraud detection, predictive system building, or understanding user behavior (in almost every business and at the highest level).
Article originally published on Wall-Street.ro, in the People Behind the Code section.