When Magento meets Python (episode: new Business Logic using ML)

A little digression before starting: I truly believe that, in a few years, the approach of implementing an algorithm by writing code in some specific language will be totally replaced by ML, which will figure the algorithm out during its training.

Let’s think about it: if you use ML to infer something empirical from available data (for example, predicting the price of a house given its position, size, condition and so on), why not prepare some data that already follows a deterministic behavior, with the goal of simply training a model and predicting new values, without the need to implement an algorithm?

Let’s see a little PoC, involving some fictional Magento raw data, to see if we can work it out.

Imagine your Store Manager wants to implement the Perfect Promotion, a formula crafted with the most advanced tool in the universe (an Excel sheet) and based on particular order conditions.

Following the usual way, the Dev Team should first understand it, find all the possible outcomes and then code it, which means days of development, testing and bug fixing.

Let’s try an ML approach with a simple (and a bit silly) example.

In this order file, the final price is calculated with a formula based on the combined length of the customer’s first name and last name, the payment method, and whether the customer is a recurring one.

Don’t understand the formula? Good, you don’t need it, that’s the idea!

So, let’s train a simple regression model so that we can predict new values that are (hopefully) similar to the formula output.

Let’s import the data and remove the unnecessary columns. Then we transform the payment method into numeric values and create two distinct sets, one for training and one for testing (20% of the data), with “final_price” as the target variable (the value to predict).
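The preparation step can be sketched roughly as follows. The original order file is not reproduced here, so this snippet builds a synthetic stand-in: the column names (`order_id`, `customer_email`, `name_length`, `payment_method`, `is_recurring`), the payment labels and the pricing formula are all assumptions made up for illustration, and one-hot encoding via `pd.get_dummies` is just one possible way to make the payment method numeric.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the order export: column names and the pricing
# formula are invented for illustration, not taken from the real file.
rng = np.random.default_rng(0)
n = 1000
orders = pd.DataFrame({
    "order_id": np.arange(n),
    "customer_email": [f"user{i}@example.com" for i in range(n)],
    "name_length": rng.integers(6, 31, size=n),  # first + last name length
    "payment_method": rng.choice(["card", "paypal", "banktransfer"], size=n),
    "is_recurring": rng.integers(0, 2, size=n),
})
rate = orders["payment_method"].map(
    {"card": 0.002, "paypal": 0.004, "banktransfer": 0.006}
)
orders["final_price"] = 100 * (
    1 - rate * orders["name_length"] * (1 + orders["is_recurring"])
)

# Remove columns that carry no information about the price.
data = orders.drop(columns=["order_id", "customer_email"])

# Turn the payment method into numeric columns (one-hot encoding).
data = pd.get_dummies(data, columns=["payment_method"], dtype=float)

# Separate features and target, then hold out 20% of the rows for testing.
X = data.drop(columns=["final_price"])
y = data["final_price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

Fixing `random_state` keeps the split reproducible, which makes it easier to compare different models on exactly the same test rows.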

Now let’s train the model with the training set and check how it performs on the test set using some metrics.
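A minimal sketch of this step, assuming scikit-learn's `LinearRegression` as the "simple regression model" and MAE, RMSE and R² as the metrics (the article does not name them). The data is again a synthetic stand-in with a made-up formula, so the numbers it prints are not the ones from the original experiment.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical data: columns and formula are assumptions for illustration.
rng = np.random.default_rng(0)
n = 1000
data = pd.DataFrame({
    "name_length": rng.integers(6, 31, size=n),
    "payment_method": rng.choice(["card", "paypal", "banktransfer"], size=n),
    "is_recurring": rng.integers(0, 2, size=n),
})
rate = data["payment_method"].map(
    {"card": 0.002, "paypal": 0.004, "banktransfer": 0.006}
)
data["final_price"] = 100 * (
    1 - rate * data["name_length"] * (1 + data["is_recurring"])
)
data = pd.get_dummies(data, columns=["payment_method"], dtype=float)

X = data.drop(columns=["final_price"])
y = data["final_price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training rows only; the test rows stay unseen until evaluation.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2:.3f}")
```

Because the made-up formula multiplies the inputs together, a purely additive linear model can only approximate it, which is the kind of gap the metrics are meant to reveal.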

Mmm, the results are not good; our Store Manager will not be happy…

This is an example of “underfitting”: our model is too simple for this data and, basically, it’s learning poorly.

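To resolve it, we can add more information and, in this case, go “polynomial”. Intuitively, imagine having only a 2-dimensional dataset: an underfitted model is a straight line drawn through points that follow a curve, while an overfitted one (the opposite problem) is a curve so wiggly that it chases every single training point.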
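The 2-dimensional intuition can be reproduced with a toy example. This is only a sketch on invented data (a noisy sine curve, not the order data): it fits polynomials of degree 1, 4 and 15 and prints the error on the training points and on fresh points. Typically the low degree is poor on both, while the very high degree keeps lowering the training error without a matching gain on new data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)

def sample(n):
    """Draw n noisy points from a toy curve (illustrative data only)."""
    x = rng.uniform(-1, 1, size=(n, 1))
    y = np.sin(3 * x[:, 0]) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = sample(30)
x_test, y_test = sample(200)

train_mse, test_mse = {}, {}
for degree in (1, 4, 15):
    # Polynomial expansion of x followed by ordinary least squares.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_mse[degree] = mean_squared_error(y_train, model.predict(x_train))
    test_mse[degree] = mean_squared_error(y_test, model.predict(x_test))
    print(f"degree={degree:2d}  train MSE={train_mse[degree]:.3f}  "
          f"test MSE={test_mse[degree]:.3f}")
```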

Our dataset is 9-dimensional, so it cannot be visualized, but the point is the same.

We know there is a function that will fit almost perfectly, because we created the data following a formula.

Let’s try using “PolynomialFeatures”, which creates additional features from polynomial combinations of the existing ones, with degrees 2 and 3.
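One way to wire this up is a scikit-learn pipeline that chains `PolynomialFeatures` with `LinearRegression`. The sketch below compares degrees 1, 2 and 3 on a synthetic dataset; the columns and the pricing formula are assumptions invented for the example (the formula is a product of the inputs, so a degree-3 expansion can represent it), and the scores are not the article's.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data: columns and formula are assumptions for illustration.
rng = np.random.default_rng(0)
n = 1000
data = pd.DataFrame({
    "name_length": rng.integers(6, 31, size=n),
    "payment_method": rng.choice(["card", "paypal", "banktransfer"], size=n),
    "is_recurring": rng.integers(0, 2, size=n),
})
rate = data["payment_method"].map(
    {"card": 0.002, "paypal": 0.004, "banktransfer": 0.006}
)
data["final_price"] = 100 * (
    1 - rate * data["name_length"] * (1 + data["is_recurring"])
)
data = pd.get_dummies(data, columns=["payment_method"], dtype=float)

X = data.drop(columns=["final_price"])
y = data["final_price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scores = {}
for degree in (1, 2, 3):
    # Degree 1 is the plain linear model; higher degrees add products
    # such as name_length * is_recurring as extra input columns.
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    model.fit(X_train, y_train)
    scores[degree] = r2_score(y_test, model.predict(X_test))
    print(f"degree={degree}  R2={scores[degree]:.4f}")
```

Keeping the expansion inside the pipeline means the same transformation is applied automatically at prediction time, so new rows never need to be expanded by hand.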

Bingo, almost perfect! Let’s try some predictions.
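Predicting prices for new orders could look like the sketch below. Everything about the data is hypothetical: `price_formula` is an invented stand-in for the Store Manager's sheet, used only to generate training rows and to check the predictions afterwards. The one practical detail worth noting is that new rows must be encoded with exactly the same columns as the training data, which `reindex` takes care of here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def price_formula(df):
    """Invented pricing rule standing in for the real, unknown formula."""
    rate = df["payment_method"].map(
        {"card": 0.002, "paypal": 0.004, "banktransfer": 0.006}
    )
    return 100 * (1 - rate * df["name_length"] * (1 + df["is_recurring"]))

# Hypothetical training data generated from the invented rule.
rng = np.random.default_rng(0)
n = 1000
orders = pd.DataFrame({
    "name_length": rng.integers(6, 31, size=n),
    "payment_method": rng.choice(["card", "paypal", "banktransfer"], size=n),
    "is_recurring": rng.integers(0, 2, size=n),
})
X = pd.get_dummies(orders, columns=["payment_method"], dtype=float)
y = price_formula(orders)

model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False), LinearRegression()
)
model.fit(X, y)

# Two new orders the model has never seen.
new_orders = pd.DataFrame({
    "name_length": [12, 25],
    "payment_method": ["paypal", "card"],
    "is_recurring": [1, 0],
})
new_X = pd.get_dummies(new_orders, columns=["payment_method"], dtype=float)
# Align with the training columns; categories missing here become 0.
new_X = new_X.reindex(columns=X.columns, fill_value=0.0)

predicted = model.predict(new_X)
expected = price_formula(new_orders).to_numpy()
for p, e in zip(predicted, expected):
    print(f"predicted={p:.2f}  formula={e:.2f}")
```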

Not bad, considering the time and the code necessary to achieve this, compared to the classic if-then approach…

See you next time!

Tech consultant (antonellocalamea.com) | Avid learner | Composer | Proudly believing less is more, except for love and knowledge
