Interview with a Data Scientist: Rosaria Silipo
As part of my Interview with Data Scientists project I recently caught up with Rosaria – who is an active member of the Data Mining community.
Bio: Rosaria has been a researcher in applications of Data Mining and Machine Learning for over a decade. Application fields include biomedical systems and data analysis, financial time series (including risk analysis), and automatic speech processing.
She is currently based in Zurich (Switzerland).
- What project that you have worked on do you wish you could go back to and do better?
There is no such thing as the perfect project! However close you get to perfection, at some point you need to stop, either because
the time has run out, or because the money has run out, or because you just need to have a productive solution. I am sure I could go back to any of my past projects
and find something to improve in each of them!
This is actually one of the biggest issues in data analytics projects: when do we stop? Of course, you need to identify some basic
deliverables in the initial phase of the project,
without which the project is not satisfactorily completed.
But once you have passed these deliverable
milestones, when do you stop?
What is the right compromise between perfection and resource investment?
In addition, every few years some new technology becomes available that could help
re-engineer your old projects, for speed or accuracy or both.
So even the most perfect project solution can, after a few years, surely be improved thanks to new technologies. This is the case, for example,
with the new big data platforms. Most of my old projects would now benefit from a
big data based speed-up. This could help to
speed up the training and deployment of old models, to create more complex data analytics models, and to optimize model parameters better.
- What advice do you have for younger analytics professionals, and in particular PhD
students in the Sciences?
Use your time to learn! Data Science is a relatively new discipline that combines old
knowledge, such as statistics and machine learning, with
newer wisdom, like big data platforms and parallel computation. Not many people
know everything here, really! So, take your time to learn what you
do not know yet from the experts in that area.
Combining a few different pieces of data science knowledge probably
makes you unique already in the data science landscape.
The more different pieces of knowledge you have, the bigger the advantage for you
in the data science ecosystem!
One easy way to get hands-on experience across a range of application fields is to
explore the Kaggle challenges.
Kaggle has a number of interesting challenges up every month and, who knows,
you might also win some money!
- What do you wish you knew earlier about being a data scientist?
This answer is related to the previous one, since my advice to
young data scientists stems from my earlier experience and failures.
My early background is in machine learning.
So, when I took my first steps in the
data science world many years ago, I thought that knowledge of
machine learning algorithms was all I needed. I wish!
I had to learn that data science is the sum of many different skills,
including data collection, data cleaning, and data transformation. The latter,
for example, is highly underestimated! In all the data science projects I have seen (not only mine), the data processing part takes well over 50% of the resources used!
The skills also include data visualization and data presentation. A brilliant solution is worth nothing if the executives and stakeholders cannot understand the results through
a clear and compact representation! And so on. I guess I wish I had taken more time early on to learn from colleagues with a different set of skills from mine.
- How do you respond when you hear the phrase ‘big data’?
Do you really need big data? Sometimes customers ask for a big data platform just because. Then, when you investigate more deeply, you realize
that they really do not have, and do not want to have, such a large amount of data to take care of every day. A nice traditional DWH (Data Warehouse) solution is definitely enough for them.
Sometimes, though, a big data solution is really needed, or at least it will be needed.
- What is the most exciting thing about your field?
Probably the variety of applications. The whole body of knowledge about data collection, data warehousing, data analytics, data visualization, results inspection and
presentation cuts across a number of application fields. You would be surprised at how many different applications can be designed using a variation of the same
data science technique! Once you have the data science knowledge and a particular application request, all you need is imagination to make the two match and find the best solution.
- How do you go about framing a data problem? In particular, how do you avoid spending too long, and how do you manage expectations? How do you know what is good enough?
I always propose a first pilot/investigation mini-project at the very beginning. This is for me to get a better idea of the application specs, of the data set, and yes, also
of the customer. This is a crucial phase, though a short one.
During this part, in fact, I can take the measure of the project in terms of the time and resources needed, and
the customer and I can study each other and adjust our expectations about input data and final results.
This initial phase usually involves a sample of the data, an understanding of the data update strategy, some visual investigation, and a first tentative analysis to
produce the requested results.
Once this part is successful and expectations have been adjusted on both sides, the real project can start.
- You spent some time as a Consultant in Data Analytics. How did you manage cultural challenges, dealing with stakeholders and executives? What advice do you have for new starters
about this?
Ah … I am really not a very good example of dealing with
stakeholders and executives and successfully managing cultural
challenges!
Usually, I rely on external collaborators to handle this part for me, also because of time constraints.
I see myself as a technical professional, with little time for talking and convincing. That is unfortunate, because this is a big part of every data analytics project.
However, when I have to deal with it myself,
I let the facts speak for me: final or
intermediate results of current and past projects.
This is the easiest way to convince stakeholders that the project is worth the time and the money.
Just in case, though, I always have at hand a set of slides
with previous accomplishments to present to executives if and when needed.
- Tell us about something cool you’ve been doing in Data Science lately.
My latest project was about anomaly detection in industry. I found it a very interesting problem to solve, where skills and expertise have to meet creativity.
In anomaly detection you have no historical records of anomalies, either because they rarely happen or because they are too expensive to let happen.
What you have is a data set of records of normal functioning of the machine, transactions, system, or whatever it is you are observing.
The challenge then is to predict anomalies before they happen and without previous historical examples. That is where the creativity comes in.
Traditional machine learning algorithms need a twist in application to provide an adequate solution for this problem.
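As a minimal illustration of this setting (not the approach used in the project described above, which the interview does not detail), the Python sketch below learns a profile from records of normal functioning only and then scores new readings by how far they fall from that profile. The simulated sensor readings, the function names, and the threshold of 3 standard deviations are all assumptions made for the example.

```python
import random
import statistics


def fit_normal_profile(readings):
    """Summarize normal behaviour as (mean, standard deviation)."""
    return statistics.mean(readings), statistics.stdev(readings)


def anomaly_score(value, profile):
    """Distance of a reading from the normal mean, in standard deviations."""
    mean, std = profile
    return abs(value - mean) / std


def is_anomalous(value, profile, threshold=3.0):
    """Flag a reading whose score exceeds the (assumed) threshold."""
    return anomaly_score(value, profile) > threshold


# Hypothetical training data: only normal-functioning records are available.
rng = random.Random(0)
normal_readings = [rng.gauss(50.0, 2.0) for _ in range(1000)]
profile = fit_normal_profile(normal_readings)

# No anomaly was ever seen during fitting; new readings are judged
# purely by how far they deviate from the learned normal profile.
print(is_anomalous(50.0, profile))
print(is_anomalous(70.0, profile))
```

The point of the sketch is only that the model is fitted without a single labelled anomaly; a real system would likely need a richer description of normal behaviour than one mean and one standard deviation.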
Source: https://peadarcoyle.wordpress.com/2015/10/03/interview-with-a-data-scientist-rosaria-silipo/
