Tuesday Session List

5 Things we've learned building large APIs with FastAPI

Maarten Huijsmans

APIs, Best Practice

5 common challenges in building FastAPI apps and how to solve them

Talk general-production

5 Things You Want to Know About AI Adoption in the Enterprise

Alexander CS Hendorf

Architecture, Best Practice, Business & Start-Ups, Corporate, Diversity & Inclusion

All one needs is strategy, skill and resources to make digitalization and AI happen. So why is everything taking so long? 5 Things You Want to Know About AI Adoption in the Enterprise.

Keynote plenary

5 Years, 10 Sprints, A scikit-learn Open Source Journey

Reshama Shaikh

Community, Science, Statistics

In this keynote, I will share highlights, challenges and lessons learned (https://www.dataumbrella.org/sprints).

Talk pydata-machine-learning-stats

`python-m5p` - M5 Prime regression trees in python, compliant with scikit-learn

Sylvain Marié

Algorithms, Predictive Modelling, Science

`python-m5p` is an implementation of the M5P algorithm compliant with scikit-learn.

Talk general-python-pydata-friends

A data scientist's guide to code reviews

Alexandra Wörner

Coding / Code-Review

Code reviews apply to all data science work - you sometimes just need to tweak them a bit. Let me show you when and how as well as what makes a fruitful code review.

Tutorial pycon-programming-software-engineering

Aspect-oriented Programming - Diving deep into Decorators

Mike Müller

Algorithms, Architecture, Python fundamentals

Functions that take functions and return new functions can be fun. Python's everything-is-an-object principle at work.
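
As a minimal sketch of the idea (not taken from the talk), a decorator is a function that takes a function and returns a new one that wraps it. The names `count_calls` and `add` are hypothetical and chosen only for illustration.

```python
import functools


def count_calls(func):
    """Take a function and return a new function that counts its calls."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    # Functions are objects, so the counter can live on the wrapper itself.
    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    return a + b


result = add(2, 3)
```

Because `add` is now the wrapper object, `add.calls` reports how often it has been called, while `functools.wraps` keeps `add.__name__` unchanged.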

Talk pycon-devops

Battle of Pipelines - who will win python orchestration in 2022?

Jannis Grönberg

Architecture, Data Engineering, DevOps

Struggling to choose the right #orchestration tool in #Python? Join this #PyCon talk about when it's best to use #Kubeflow, #Airflow or #Prefect and learn how to automate your #data #pipelines and #ML workflows. #DataScience #dataengineering #DevOps #MLOps

Talk general-ethics

Biases in Language Models

sonam

Diversity & Inclusion, Ethics (Privacy, Fairness,… ), Natural Language Processing

A study of gender biases in popular language models and of model debiasing techniques

Talk pycon-web

But this is an OAuth, is it not?

Sara Jakša

APIs, Backend

OAuth simplified and secured third-party integrations for the end user. But for the developer of the integration, it can still present some friction. This talk covers examples of real-life problems encountered while implementing multiple OAuth integrations.

Talk pydata-computer-vision

Challenge Accepted - How to Escape the Quicksand While Engineering a Computer Vision Application

Bettina Heinlein

Computer Vision

Leveraging problem-solving strategies for challenges in building Computer Vision applications and beyond, illustrated with a recent Computer Vision project.

Talk general-python-pydata-friends

conda-forge: supporting the growth of the volunteer-driven, community-based packaging project

Wolf Vollprecht, Jannis Leidel, Jaime Rodríguez-Guerra

Community, Packaging, Python - PyPy, Cython, Anaconda

How does the conda-forge packaging community work, what is its relationship to conda and PyPI and how can everyone package software with it?

Tutorial pydata-pydata-scientific-libraries-stack

Data Science at Scale with Dask

Richard Pelgrim

APIs, Big Data, Cloud

A hands-on introduction to methods for scaling your data science and machine learning with Dask.

Talk pydata-natural-language-processing

deepdoctection - An open source package for document intelligence

Janis Meyer

Computer Vision, Natural Language Processing

deepdoctection is a Python package that enables document analysis pipelines to be built using deep learning models.

Talk pydata-machine-learning-stats

Detecting drift: how to evaluate and explore data drift in machine learning systems

Emeli Dral

Best Practice, Data Visualization, Statistics

When an ML model is in production, you might encounter data and prediction drift. How exactly do you detect and evaluate it? I'll share how in this talk.

Talk general-community-diversity-carreer-life-and-everything-else

Do we really need Data Scientists?

Dr. Setareh Sadjadi

Career & Freelancing, Community

Is Data Science really cooling down? Do we need Data Scientists? What for?

Tutorial pydata-visualization

Easily build interactive plots and apps with hvPlot

Philipp Rudiger, Maxime Liquet

Data Visualization, Jupyter, Science

Do you use the .plot() API of pandas or xarray? Do you ever wish it was easier to try out different combinations of the parameters in your data-processing pipeline? Follow this tutorial to learn how to easily build interactive plots and apps with hvPlot.

Talk pydata-natural-language-processing

Efficient data labelling with weak supervision

Maria Mestre

Data Engineering, Data Visualization, Natural Language Processing

Data labelling should not be a waterfall task. Label your data significantly faster with weak supervision (https://github.com/dataqa/dataqa)

Tutorial pycon-testing

Faster Workflow with Testdriven Development

Torsten Zielke

Best Practice, Backend, Coding / Code-Review

Learn how to use test-driven development to boost your productivity and let the community do the annoying, frequent checks of whether the application still works.

Talk pydata-pydata-scientific-libraries-stack

Flexible ML Experiment Tracking System for Python Coders with DVC and Streamlit

Antoine Toubhans

Best Practice, Computer Vision, Data Engineering, Data Visualization, Development Methods, Reproducibility

Flexible ML Experiment Tracking System for Python Coders with DVC and Streamlit

Talk general-community-diversity-carreer-life-and-everything-else

Forget ‘web 3.0’, let's talk about ‘web 0.0’. A brief history of the Internet, and the World Wide Web.

Dom Weldon

Art, Social Sciences, Theory

Forget ‘web 3.0’, let's talk about ‘web 0.0’. A brief history of the Internet, and the World Wide Web.

Talk pydata-computer-vision

Grokking LIME: How can we explain why an image classifier "knows" what’s in a photo without looking inside the model?

Kilian Kluge

Computer Vision, Neural Networks / Deep Learning, Transparency / Interpretability

How can LIME explain machine-learning models without peeking inside? Let's find out!

Talk pydata-machine-learning-stats

Honey, I shrunk the target variable! Common pitfalls when transforming the target variable and how to exploit transformations.

Florian Wilhelm

Math, Predictive Modelling, Statistics

Honey, I shrunk the target variable! Common pitfalls when transforming the target variable and how to exploit transformations.

How a simple streamlit dashboard will help to put your machine learning model in production

Daniël Willemsen, Welmoet Verbaan

Best Practice, Data Visualization, Predictive Modelling

Have you struggled to get your valuable machine learning model into the hands of users? A simple Streamlit monitoring dashboard can help!

Talk general-community-diversity-carreer-life-and-everything-else

How to deal with toxic people

Gina Häußge

Best Practice, Community

As an open source maintainer, sooner or later you'll encounter ungrateful, entitled or outright toxic people who can be a real drain on your motivation and general mental health. Here are some coping strategies that work for me!

Tutorial general-production

Introduction to MLOps with MLflow

Tobias Sterbak

Best Practice, Predictive Modelling, Reproducibility

Learn the basics of MLOps with MLflow to manage the machine learning life-cycle.

Talk pydata-jupyter

JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python

Jeremy Tuloup

Jupyter, Reproducibility, Use Case

JupyterLite is a Jupyter distribution that runs entirely in the web browser, backed by in-browser language kernels such as the WebAssembly powered Pyodide kernel. JupyterLite enables data science and interactive computing with the PyData scientific stack, directly in the browser.

Talk pydata-machine-learning-stats

Machine Learning Testing Ecosystem of Python

Yunus Emrah Bulut

Computer Vision, Ethics (Privacy, Fairness,… ), Governance, Natural Language Processing, Neural Networks / Deep Learning, Security

Machine learning testing is becoming an indispensable part of MLOps, and Python offers a great ecosystem for this purpose.

Talk pycon-django

Make the most of Django

Paolo Melchiorre

Best Practice, Community, Django

🐍 "Make the most of Django" 👉 Taking full advantage of #OpenSource software means getting involved in its #community and #contributing to its development. We'll see how this is profoundly true in the #Django case as well. #pyconde #talk #python

Talk general-production

Making Machine Learning Applications Fast and Simple with ONNX

Jan-Benedikt Jagusch, Christian Bourjau

Data Engineering, DevOps, Packaging

In this session, you will learn how to use ONNX for your machine learning model deployments, which can reduce your single-row inference time by up to 99% while also drastically simplifying your model management.

Tutorial general-community-diversity-carreer-life-and-everything-else

ML Communication 101: How to talk about Machine Learning with anyone

Julia Ostheimer

Best Practice, Business & Start-Ups, Career & Freelancing, Corporate, Diversity & Inclusion, Ethics (Privacy, Fairness,… ), Transparency / Interpretability, Use Case

Want to know how to explain to your grandparents what #MachineLearning is? Attend the #PyConDE #PyData tutorial on how to translate #ML terms into the everyday language of any audience. #communication #101 #tutorial #softskills #AI

My forecast is better than yours! What does that even mean?

Illia Babounikau

Statistics, Time Series

Established forecast evaluation procedures often turn out to be inappropriate and biased for modern time series forecasting. I will present a number of forecast evaluation issues and resolutions based on real use cases of demand forecasting developed at BlueYonder.

Overcoming 5 Hurdles to Using Jupyter Notebooks for Data Science, by the JetBrains Datalore Team

Alena Guzharina

Data Visualization, Jupyter, Reproducibility

Overcoming 5 Hurdles to Using Jupyter Notebooks for Data Science, by the JetBrains @Datalore Team. Join our talk to discuss setting up environments, working with data, writing code without IDE support, and sharing results, as well as collaboration and reproducibility.

Talk pydata-machine-learning-stats

Predictive Maintenance and Anomaly Detection for Wind Energy

Tobias Hoinka

Predictive Modelling, Statistics, Time Series

This talk will describe predictive modeling applications in wind turbine maintenance, the challenges of anomaly detection and ways to move to more automatic diagnoses by modeling past documented defects.

Panel general-community-diversity-carreer-life-and-everything-else

Python for Everyone - PyLadies' Insights Panel Discussion

Jessica Greene (she/her)

Community, Diversity & Inclusion

Join this panel to learn how PyLadies volunteers and organizers make a difference, what they would like the wider Python community to understand so they can be more effective in their work, and what you could do tomorrow to help advance this work.

Tutorial pydata-pydata-scientific-libraries-stack

Reproducible machine learning and science with python

Prabhant Singh

Best Practice, Community, Science

Learn how to create reproducible workflows, benchmarks and studies with openml-python

Talk pydata-visualization

Sankey Plots with Python

Daniel Ringler

Data Visualization, Jupyter, Python fundamentals

Sankey Plots in Python? Get an introduction on how and when to use them.

Talk pydata-machine-learning-stats

Secure ML: Automated Security Best Practices in Machine Learning

Alejandro Saucedo

Best Practice, Data Engineering, Security

As data science capabilities scale, the core concept of security becomes increasingly critical. In this talk we provide an overview of challenges, solutions and best practices for introducing security into the ML lifecycle.

Talk general-python-pydata-friends

Slack bots 101: An introduction into slack bot-based workflow automation

Jordi Smit

APIs, DevOps, Use Case

Most developers work with Slack every day, yet very few of them know about the awesome things you can do when you build your own Slack bot. During this talk, we will teach you to build and deploy your first Slack bot.

Speeding up Python with Zig

Adam Serafini

Packaging, Performance

Let's speed up Python, with Zig! A tour through Python's C API and packaging challenges...

Talk pycon-programming-software-engineering

Stupid Things I've Done With Python

Mark Smith

Best Practice, Coding / Code-Review, Python fundamentals

On every computer I've had for the past 20 years, I've created a folder called "stupid python tricks". It's where I put code that should never see the light of day. Code I'm going to teach you.

Talk pycon-devops

The state of DevOps for Python projects

Tobias Heintz

Data Engineering, Development Methods, DevOps

How alcemy uses DevOps techniques to streamline and accelerate our daily development. Let's look at a number of real-world examples and best practices taken straight from the pipelines we use to release code several times a day.

Talk general-python-pydata-friends

Unclear Code Hurts

Dario Cannone

Best Practice, Coding / Code-Review

Code may work or not, but it will always tell a story. Computers will not complain about how you write it (as long as the syntax is correct), but human readers will. This talk is about writing clear code and caring for the human beings that will read it. Yourself included.

Unsupervised shallow learning for fraud detection on marketplaces

Andreu Mora

Algorithms, Best Practice, Predictive Modelling

Tune in to learn how @adyen uses ML and open source in Python to combat fraud and wrongdoing on large marketplaces such as @gofundme or @eBay

Using a database in a data science project - Lessons learned in production

Jacopo Farina

Data Engineering, Databases

Lessons learned in 4 years using Postgres in a machine learning project

Tutorial pycon-devops

We know what your app did last summer. Do you? Observing Python applications using Prometheus.

Jessica Greene (she/her), Vanessa Aguilar

Data Visualization, DevOps, Performance

We know what your app did last summer. Do you? Join us for this practical & theoretical session if you’re looking to grasp the key concepts of observability, useful metrics, and ensuring operational excellence for your Python applications using Prometheus!

Talk pycon-web

Web based live visualisation of sensor data

Jannis Lübbe

APIs, Data Visualization, Use Case

Streaming sensor data to multiple end devices using FastAPI and websockets.

Talk pydata-data-handling

What are data unit tests and why we need them

Theodore Meynard

Best Practice, Data Engineering

This talk will introduce the concept of data unit tests and why they are important in the workflow of data scientists when building data products.

Talk pydata-visualization

Your data, your insights: creating personal data projects to (re-)own the data you share

Paula Gonzalez Avalos

Data Visualization, Predictive Modelling

Your data, your insights: 3 examples to illustrate how we can apply common data science libraries together with data shared via mobile apps or collected manually to build little data visualization projects that provide unique, contextual and intimate insights.