As first published in InfoWorld

The past 16 months have revealed how valuable data science can be while also exposing its limitations. Expect big advances in the year to come.

2020 could be called The Year Data Science Grew Up. Organizations of all kinds significantly ramped up their adoption of data-oriented applications and turned to data science to solve their problems — with varying degrees of success. In the process, data science was increasingly called upon to show its maturity and prove its real value, demonstrating that it actually worked in production.

The emergence of a deadly global pandemic threw a wrench into designs — not all of them good — that had grown over the course of years in ways that have become difficult…

As first published in The Next Web

The data science dilemma: automation, APIs, or custom data science?

As companies place an increasing premium on data science, there is some debate about which approach is best to adopt — and there is no straight up, one-size-fits-all answer. It really depends on your organization’s needs and what you hope to accomplish.

There are three main approaches that have been discussed over the past couple of years; it’s worth taking a look at the merits and limitations of each as well as the human element involved. …


Not sure what movie to watch? Ask your recommender system

As first published in Datanami

Collaborative filtering (CF) based on the alternating least squares (ALS) technique is another algorithm used to generate recommendations. It produces automatic predictions (filtering) about the interests of a user by collecting preferences from many other users (collaborating). The underlying assumption of the CF approach is that if a person A has the same opinion as a person B on an issue, A is more likely to have B’s opinion on a different issue than a randomly chosen person. …
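As a rough illustration of the idea, here is a minimal NumPy sketch of ALS-based collaborative filtering. It is not the implementation used in the article: the function name `als`, the toy rating matrix, and the hyperparameters (`n_factors`, `reg`, `n_iters`) are all assumptions chosen for this example. The ratings matrix is approximated as the product of a user-factor matrix and an item-factor matrix, and each is solved for in turn while the other is held fixed.

```python
import numpy as np


def als(ratings, mask, n_factors=2, reg=0.1, n_iters=50, seed=0):
    """Approximate ratings as U @ V.T, fitting only entries where mask is True.

    All names and default values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, n_factors))
    V = rng.normal(scale=0.1, size=(n_items, n_factors))
    ridge = reg * np.eye(n_factors)

    for _ in range(n_iters):
        # Hold item factors fixed; solve a small ridge regression per user.
        for u in range(n_users):
            seen = mask[u]
            V_seen = V[seen]
            U[u] = np.linalg.solve(V_seen.T @ V_seen + ridge,
                                   V_seen.T @ ratings[u, seen])
        # Hold user factors fixed; solve a small ridge regression per item.
        for i in range(n_items):
            seen = mask[:, i]
            U_seen = U[seen]
            V[i] = np.linalg.solve(U_seen.T @ U_seen + ridge,
                                   U_seen.T @ ratings[seen, i])
    return U, V


# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
mask = ratings > 0

U, V = als(ratings, mask)
predicted = U @ V.T  # includes estimates for the unrated cells
print(np.round(predicted, 2))
```

The cells where `mask` is False are the ones a recommender would care about: their predicted values come entirely from the preferences of other users with similar rating patterns.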


Automatic tagging of disease names in biomedical literature

By Jeanette (Jeany) Prinz

The rapid growth in the amount of biomedical literature becoming available makes it impossible for humans alone to extract and exhaust all of the useful information it contains. There is simply too much there. Despite our best efforts, many things would fall through the cracks, including valuable disease-related information. Hence, automated access to disease information is an important goal of text-mining efforts [1]. This enables, for example, the integration with other data types and the generation of new hypotheses by combining facts that have been extracted from several sources [2].

In this use case, we will…


Is forecasting of soccer tournaments really possible?

I have never believed much in predicting the outcome of major sports tournaments, for two main reasons: I am not a sports expert, and sports tournaments always include an amount of randomness that is hard to predict. This especially applies to soccer games.

Well, apparently, I was wrong and Yodime was right.

Figure 1. Yodime (by permission of owner).

Yodime is that sort of genetic cross between a Star Wars guru (Yoda) and a data science guru (KNIME). …

An interview with Dean Abbott and John Elder about change management, complexity, interpretability, and the risk of AI taking over humanity


After the KNIME Fall Summit, the dinosaurs went back home… well, switched off their laptops. Dean Abbott and John Elder, longstanding data science experts, were invited to the Fall Summit by Michael to join him in a discussion of The Future of Data Science: A Fireside Chat with Industry Dinosaurs. The result was a sparkling conversation about data science challenges and new trends. Since the studio lights were switched off, Rosaria has distilled and expanded some of the highlights about change management, complexity, interpretability, and more in the data science world. …


Practice and knowledge will design the best recipe for each case

What are the steps in data preparation? Are there specific steps we need to take for specific problems? The answer is not that straightforward: Practice and knowledge will design the best recipe for each case.

First, there are two types of data preparation: KPI calculation to extract the information from the raw data and data preparation for the data science algorithm. While the first one is domain and business dependent, the second one is more standardized.

In this article, we focus on operations to prepare data to feed a machine learning algorithm. There are many of these data operations, some…

KNIME Community

A new kind of KNIME event on July 7, 10 AM (Berlin) or 12 PM (Chicago)

Avatar for Rosaria Silipo

Hi! Meet my avatar. It looks a bit like me, don’t you think? In a low-resolution kind of way…

My avatar and I are attending the next KNIME Data Talks — Community Edition, where we hope to meet and network with other KNIME user avatars. Yes, for the next KNIME Data Talks event, you need to come with your own avatar!

Let’s proceed now with a little more order.

The KNIME Data Talks — Community Edition will take place on July 7 at two different times: 10:00 AM UTC +2 (Berlin) and 12:00 PM UTC -5 (Chicago). Same…


You have just downloaded KNIME Analytics Platform, what next?

Here are seven steps for a fast, practical, learning-by-doing start to using it. Once you’ve got started, take a look at more educational material, such as our e-learning courses, onsite courses, cheatsheets, e-books, videos, local meetup events, and more. Our “sat nav” for finding the educational resources that best suit your skills and time constraints is the blog article Get on Board and Navigate the Learning Options at KNIME!

The Seven Things To Do

  1. Explore the welcome page
  2. Learn from a pre-built workflow
  3. Get familiar with the workbench of KNIME Analytics Platform
  4. Find more resources on the KNIME Hub…

Components | Metanodes | KNIME Analytics Platform

What is a metanode? What is a component? And when do you use what?

Let’s help clarify the differences between metanodes and components in KNIME Analytics Platform.

The common goal: bring order to a messy workflow

Both metanodes and components are useful for cleaning up messy workflows. You can identify isolated blocks of logical operations in your workflows and include them inside either a metanode or a component. Your workflow will appear neat and tidy, with fewer nodes than the original workflow.

And that is where the metanode goal in life ends.

Figure 1. Two visual configurations of the same example workflow. The use of metanodes and components (right) makes the view neat and clear.

What can a component do that a metanode cannot?

Let’s now see what a component offers beyond what a metanode does.

A component can encapsulate flow variables

“What happens in the component stays in the component.” This sentence describes the vacuum character of a…

Rosaria Silipo

Rosaria has been mining data since her master’s degree, through her doctorate and the job positions that followed. She is now a data scientist and KNIME evangelist.
