`python-m5p` - M5 Prime regression trees in python, compliant with scikit-learn
Sylvain Marié
Regression trees are powerful machine learning models that combine modeling flexibility with interpretability, provided the tree is not too deep. The M5 algorithm was introduced by Quinlan in 1992 under the name "model tree". It is derived from classic regression trees (e.g. CART, Breiman et al., 1984) and adds the ability to prune the tree and to use linear regression models at the leaves. The goal is to reduce the number of branches and leaves, ultimately making the tree more interpretable and its predictions smoother. Wang & Witten improved the algorithm in 1997 under the name M5 Prime (abbreviated M5' or M5P). It gained popularity about a dozen years later, in particular through the Weka machine learning toolbox, which provides a Java-based implementation.
`python-m5p` is an implementation of the M5P algorithm compliant with scikit-learn.
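To make the "linear regression models at leaves" idea concrete, here is a minimal sketch built only from standard scikit-learn pieces. It is not the M5 or M5P algorithm (there is no pruning or smoothing step) and it does not use the `python-m5p` API; the `LeafLinearTree` class and the toy data are hypothetical, introduced purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor


class LeafLinearTree:
    """Toy model tree: a shallow CART tree with one linear model per leaf.

    Illustration only: no M5 pruning or smoothing, and not the python-m5p API.
    """

    def __init__(self, max_depth=2, min_samples_leaf=10):
        self.tree = DecisionTreeRegressor(
            max_depth=max_depth, min_samples_leaf=min_samples_leaf, random_state=0
        )
        self.leaf_models = {}

    def fit(self, X, y):
        self.tree.fit(X, y)
        leaf_ids = self.tree.apply(X)  # index of the leaf each sample falls in
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            # Replace the constant leaf prediction with a linear fit
            self.leaf_models[leaf] = LinearRegression().fit(X[mask], y[mask])
        return self

    def predict(self, X):
        leaf_ids = self.tree.apply(X)
        y_pred = np.empty(X.shape[0])
        for leaf, model in self.leaf_models.items():
            mask = leaf_ids == leaf
            if mask.any():
                y_pred[mask] = model.predict(X[mask])
        return y_pred


# Piecewise-linear toy data (assumed for the example): the slope changes at x = 0
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.where(X[:, 0] < 0, -2.0 * X[:, 0], 0.5 * X[:, 0]) + rng.normal(0, 0.1, 400)

# Same partition for both models; only the leaf predictors differ
model_tree = LeafLinearTree(max_depth=1, min_samples_leaf=10).fit(X, y)
plain_tree = DecisionTreeRegressor(
    max_depth=1, min_samples_leaf=10, random_state=0
).fit(X, y)

mse_model_tree = np.mean((model_tree.predict(X) - y) ** 2)
mse_plain_tree = np.mean((plain_tree.predict(X) - y) ** 2)
print(f"leaf-linear tree MSE: {mse_model_tree:.3f}")
print(f"plain tree MSE:       {mse_plain_tree:.3f}")
```

With the same two-leaf partition, the linear leaves fit the training data more closely than constant leaves, which is why a model tree can stay shallow and still be accurate.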
Sylvain Marié
Affiliation: Schneider Electric
Sylvain received his General Engineering degree from CentraleSupelec (Paris) and an MSc in Machine Learning from UCL (London) in 2005, with awards for his thesis on Automated Medical Diagnosis using Semi-Supervised Learning. He joined Schneider Electric as an embedded software engineer, exploring how industrial gateways could leverage SOA/M2M/Web2.0. His work within EU-funded innovation projects was published and transferred into industrial IoT offers.
In 2010, Sylvain joined an energy efficiency program for tertiary buildings, leading Monitoring & Analytics topics. He developed a platform used for BI and Visual Analytics prototypes. He opened collaborations with major universities and labs and assessed key technology partners.
Since 2013, Sylvain has been leading projects ranging from AI Research [1] to Analytics-as-a-Service industrialization in multiple market segments, with production targets such as the various Schneider Electric EcoStruxure Advisors [2] and the Exchange [3]. He has supervised four PhD students, leads an internal group of Python users, and is an active contributor to the broader open source Python community, through both flagship libraries (scikit-learn, nox, pytest...) and his own libraries (pyfields, pytest-cases, makefun...) [4]. Finally, since 2020 Sylvain has been teaching a small "datascience with python" course for Master's students.