Talk Session List

Talk pycon-python-language

"Easy Python": lies, damned lies, and metaclasses

Grigory Petrov, Maxim Danilov

Best Practice, Coding / Code-Review, Development Methods

Top 10 Python complexities, and how they are required to fight the "software complexity problem" in big projects.

Talk pydata-pydata-scientific-libraries-stack

5 Steps to Speed Up Your Data-Analysis on a Single Core

Jonathan Striebel

Data Engineering, Performance

Your data analysis pipeline works. Nice. Could it be faster? Probably. Do you need to parallelize? Not yet. Discover optimization steps that boost the performance of your data analysis pipeline on a single core, reducing time & costs.

Talk general-production

5 Things You Want to Know About AI Adoption in the Enterprise

Alexander CS Hendorf

Architecture, Best Practice, Business & Start-Ups, Corporate, Diversity & Inclusion

All one needs is strategy, skill and resources to make digitalization and AI happen. So why is everything taking so long? 5 Things You Want to Know About AI Adoption in the Enterprise.

Talk pydata-machine-learning-stats

`python-m5p` - M5 Prime regression trees in python, compliant with scikit-learn

Sylvain Marié

Algorithms, Predictive Modelling, Science

`python-m5p` is an implementation of the M5P algorithm compliant with scikit-learn.

Talk general-python-pydata-friends

A data scientist's guide to code reviews

Alexandra Wörner

Coding / Code-Review

Code reviews apply to all data science work - you sometimes just need to tweak them a bit. Let me show you when and how as well as what makes a fruitful code review.

Talk pycon-django

Advanced Django ORM

Bas Steins

Databases, Django

Leverage the potential of Django ORM to write complex queries, optimize performance and have fun with constraints

Talk pycon-devops

Battle of Pipelines - who will win python orchestration in 2022?

Jannis Grönberg

Architecture, Data Engineering, DevOps

Do you struggle to choose the right #orchestration tool in #Python? Join this #PyCon talk about when it's best to use #Kubeflow, #Airflow or #Prefect and learn how to automate your #data #pipelines and #ML workflows. #DataScience #dataengineering #DevOps #MLOps

Talk general-ethics

Biases in Language Models

sonam

Diversity & Inclusion, Ethics (Privacy, Fairness,… ), Natural Language Processing

Study of gender biases in popular language models and debiasing model techniques

Talk pydata-computer-vision

Building a Sign-to-Speech prototype with TensorFlow, Pytorch and DeepStack: How it happened & What I learned

Steven Kolawole

Computer Vision, Neural Networks / Deep Learning

Building a working E2E prototype that detects sign language meanings in images/videos and generates the equivalent voice of the words communicated by the sign language, in real time, can't be completed in a day's work. Here I'll explain how it happened and what I learned in the process.

Talk pycon-libraries

Building an ORM from scratch

Jonathan Oberländer, Patrick Schemitz

Art, Databases

From an empty Python file to a fully-featured ORM in 45 minutes

Talk pycon-web

But this is an OAuth, is it not?

Sara Jakša

APIs, Backend

OAuth simplified and secured third-party integrations for the end user. But for the developer of the integration, it can still present some friction. This talk covers examples of real-life problems encountered while implementing multiple OAuth integrations.

Talk pydata-computer-vision

Can you Read This? (Or: how I Improved Text Readability on the Web for the Visually Impaired)

Asya Frumkin

Algorithms, Computer Vision, Neural Networks / Deep Learning

I will explain my approach to detecting text on top of an image background that is unreadable to people with visual impairment. I will explain the challenges I encountered when using different OCR architectures for this task and talk about the solution I came up with.

Talk pydata-computer-vision

Challenge Accepted - How to Escape the Quicksand While Engineering a Computer Vision Application

Bettina Heinlein

Computer Vision

Leveraging problem-solving strategies for challenges in building Computer Vision applications and beyond, illustrated with a recent Computer Vision project.

Talk general-community-diversity-carreer-life-and-everything-else

Come as you are: Transitioning from Science to Data Science

Dr. Hannah Bohle

Career & Freelancing

Come as you are: Transitioning from Science to Data Science. How to find your first job in industry after leaving academia.

Talk general-python-pydata-friends

conda-forge: supporting the growth of the volunteer-driven, community-based packaging project

Wolf Vollprecht, Jannis Leidel, Jaime Rodríguez-Guerra

Community, Packaging, Python - PyPy, Cython, Anaconda

How does the conda-forge packaging community work, what is its relationship to conda and PyPI and how can everyone package software with it?

Talk pydata-pydata-scientific-libraries-stack

Creating 3D Maps using Python

Martin Christen

GIS / Geo-Analytics

Create 3D maps anywhere on the planet using Python and OpenData

Talk pydata-pydata-scientific-libraries-stack

Data Apis: Standardization of N-dimensional arrays and dataframes

Stephannie Jimenez Gacha

APIs

Introduction to the consortium of Data APIs, where we will be presenting our motivation, objectives and progress of the standardization process after one year of activity.

Talk pydata-natural-language-processing

deepdoctection - An open source package for document intelligence

Janis Meyer

Computer Vision, Natural Language Processing

deepdoctection is a Python package that enables document analysis pipelines to be built using deep learning models.

Talk pycon-python-language

Demystifying Python's Internals: Diving into CPython by implementing a pipe operator

Sebastiaan Zeeff

Python - CPython new features, Python fundamentals

Do you want to dive into the CPython Source Code but feel a bit overwhelmed? Watch Sebastiaan Zeeff demystify CPython's Internals by taking you through the implementation of a new operator.

Talk pydata-machine-learning-stats

Detecting drift: how to evaluate and explore data drift in machine learning systems

Emeli Dral

Best Practice, Data Visualization, Statistics

When an ML model is in production, you might encounter data and prediction drift. How exactly do you detect and evaluate it? I'll share in this talk.

Talk general-ethics

Do I need to be Dr. Frankenstein to create real-ish synthetic data?

Gatha

Data Engineering, Ethics (Privacy, Fairness,… ), Governance

Synthetic data not only address privacy needs but also offer a workaround for unprecedented situations. This talk introduces their different types, the options for their generation, and how you don't need to be a mad scientist to make realistic synthetic data.

Talk general-community-diversity-carreer-life-and-everything-else

Do we really need Data Scientists?

Dr. Setareh Sadjadi

Career & Freelancing, Community

Is Data Science really cooling down? Do we need Data Scientists? What for?

Talk pydata-pydata-scientific-libraries-stack

Easy and flexible imaging with the Core Imaging Library

Vaggelis Papoutsellis, Dr. Jakob Sauer Jørgensen

Algorithms, Big Data, Math

Core Imaging Library is an open-source, object-oriented Python library for inverse problems in imaging developed by the UK academic network CCPi.

Talk pydata-natural-language-processing

Efficient data labelling with weak supervision

Maria Mestre

Data Engineering, Data Visualization, Natural Language Processing

Data labelling should not be a waterfall task. Label your data significantly faster with weak supervision (https://github.com/dataqa/dataqa)

Talk pycon-programming-software-engineering

Fast native data structures: C/C++ from Python

Stefan Behnel

Big Data, Parallel Programming / Async, Python - PyPy, Cython, Anaconda

Need fast data access in Python? Use native data structures with Cython!

Talk pydata-deep-learning

Financial Portfolio Management with Deep Reinforcement Learning

T-Berger

Neural Networks / Deep Learning, Simulation, Time Series

Intelligent portfolio optimization with deep reinforcement learning

Talk pydata-pydata-scientific-libraries-stack

Flexible ML Experiment Tracking System for Python Coders with DVC and Streamlit

Antoine Toubhans

Best Practice, Computer Vision, Data Engineering, Data Visualization, Development Methods, Reproducibility

Flexible ML Experiment Tracking System for Python Coders with DVC and Streamlit

Talk pycon-devops

Forget Mono vs. Multi-Repo - Building Centralized Git Workflows with Python

David Melamed

Cloud, Coding / Code-Review, DevOps, Security

No need to reinvent the CI/CD wheel for every service - learn how to build centralized git workflows for all your repos in Python.

Talk general-community-diversity-carreer-life-and-everything-else

Forget ‘web 3.0’, let's talk about ‘web 0.0’. A brief history of the Internet, and the World Wide Web.

Dom Weldon

Art, Social Sciences, Theory

Forget ‘web 3.0’, let's talk about ‘web 0.0’. A brief history of the Internet, and the World Wide Web.

Talk pydata-data-handling

Fundamentals of relational databases

Katharina Rasch

Databases

Somewhat comfortable with using SQL to access data, but curious to know what happens behind the scenes when you send off your query?

Talk pydata-computer-vision

Grokking LIME: How can we explain why an image classifier "knows" what’s in a photo without looking inside the model?

Kilian Kluge

Computer Vision, Neural Networks / Deep Learning, Transparency / Interpretability

How can LIME explain machine-learning models without peeking inside? Let's find out!

Talk pydata-machine-learning-stats

Honey, I shrunk the target variable! Common pitfalls when transforming the target variable and how to exploit transformations.

Florian Wilhelm

Math, Predictive Modelling, Statistics

Honey, I shrunk the target variable! Common pitfalls when transforming the target variable and how to exploit transformations.

Talk general-production

How to build a Python-based Research Cloud Platform from scratch

Andre Fröhlich

Architecture, Business & Start-Ups, Use Case

This talk will present the journey of a quantitative asset manager from an outdated (non-Python) onPrem research setup to a modern Python-centric cloud research platform. We will examine the requirements and challenges associated with the project and present how we navigated find

Talk general-community-diversity-carreer-life-and-everything-else

How to deal with toxic people

Gina Häußge

Best Practice, Community

As an open source maintainer, sooner or later you'll encounter ungrateful, entitled or outright toxic people who can be a real drain on your motivation and general mental health. Here are some coping strategies that work for me!

Talk pycon-programming-software-engineering

How to Find Your Way Through a Million Lines of Code

Jürgen Gmach

Best Practice

Scared of a new project? @jugmac00 will show you "How to Find Your Way Through a Million Lines of Code"

Talk pydata-deep-learning

How to Trust Your Deep Learning Code

Tilman Krokotsch

Best Practice, Neural Networks / Deep Learning

Write unit tests and learn to trust your Deep Learning code again.

Talk general-community-diversity-carreer-life-and-everything-else

Impact of Cultivating a Diverse and Inclusive Workplace

Riya Bansal

Community, Diversity & Inclusion

Let’s face it. The positive impact of diversity and inclusion is no longer debatable.

Talk pydata-pydata-scientific-libraries-stack

Introducing the Dask Active Memory Manager

Guido Imperiale

Algorithms, Architecture, Backend, Cloud, Data Engineering, Distributed Computing, Parallel Programming / Async

The Active Memory Manager is a new experimental feature of Dask which aims to reduce the memory footprint of the cluster, prevent hard-to-debug out-of-memory issues, and make worker retirement more robust.

Talk pycon-libraries

Introduction to OPC-UA and industrial IoT: Liberate machines from the proprietary clutches of Big Hardware with the power of opcua-asyncio

Joey Faulkner

Backend, Hardware, Networks

Software around industrial hardware is still highly proprietary, which leads to bad UX and inefficient use of hardware. OPC-UA represents an earnest new start for the world of IIoT, and using opcua-asyncio, we can create this revolution in Python.

Talk pydata-machine-learning-stats

Introduction to Uplift Modeling

Dr. Juan Orduz

Algorithms, Predictive Modelling, Statistics

In this talk we introduce uplift modelling, a method to estimate conditional average treatment effects (CATE) using machine learning estimators.

Talk pycon-web

It is all about files and HTTP

Efe Öge

APIs, Architecture, Backend, Cloud, DevOps, Django

Managing files won't get easier, but it will be more obvious after this talk.

Talk pycon-libraries

jsonargparse - Say goodbye to configuration hassles

Marianne Stecklina

Best Practice

A proper CLI would be nice, but you're way too lazy to write it? Join this talk to learn about the open-source library jsonargparse!

Talk pydata-jupyter

JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python

Jeremy Tuloup

Jupyter, Reproducibility, Use Case

JupyterLite is a Jupyter distribution that runs entirely in the web browser, backed by in-browser language kernels such as the WebAssembly powered Pyodide kernel. JupyterLite enables data science and interactive computing with the PyData scientific stack, directly in the browser.

Talk pydata-machine-learning-stats

Machine Learning Testing Ecosystem of Python

Yunus Emrah Bulut

Computer Vision, Ethics (Privacy, Fairness,… ), Governance, Natural Language Processing, Neural Networks / Deep Learning, Security

Machine learning testing is becoming an indispensable part of MLOps, and Python offers a great ecosystem for this purpose.

Talk pycon-django

Make the most of Django

Paolo Melchiorre

Best Practice, Community, Django

🐍 "Make the most of Django" 👉 Taking full advantage of #OpenSource software means getting involved in its #community and #contributing to its development. We'll see how this is profoundly true in the #Django case as well. #pyconde #talk #python

Talk general-production

Making Machine Learning Applications Fast and Simple with ONNX

Jan-Benedikt Jagusch, Christian Bourjau

Data Engineering, DevOps, Packaging

In this session, you will learn how to use ONNX for your machine learning model deployments, which can reduce your single-row inference time by up to 99% while also drastically simplifying your model management.

Talk pycon-web

Navigating the limitations of Python’s concurrency model in web services

Tarek Mehrez

APIs, Architecture, Parallel Programming / Async

Ever wondered when you should favor an async web framework? How do they compare to your good old Python services when scaling is in question? Then this is the talk for you.

Talk pydata-pydata-scientific-libraries-stack

On Blocks, Copies and Views: updating pandas' internals

Joris Van den Bossche

APIs, Data Structures

As a pandas user, did you ever run into the SettingWithCopyWarning? Quite likely, and this is one of the more confusing aspects of pandas. But it doesn’t have to be this way! Check out my proposal to simplify this aspect of pandas.

Talk pydata-natural-language-processing

Performing Content: Can NLP and Deep Learning algorithms predict reader preferences?

Sebastian Cattes

Natural Language Processing, Neural Networks / Deep Learning, Statistics

Can AI understand what drives user engagement? Join our talk "Performing Content: Can NLP and Deep Learning algorithms predict reader preferences?" to find out what NLP can bring to the editorial table.

Talk pydata-machine-learning-stats

Predictive Maintenance and Anomaly Detection for Wind Energy

Tobias Hoinka

Predictive Modelling, Statistics, Time Series

This talk will describe predictive modeling applications in wind turbine maintenance, the challenges of anomaly detection and ways to move to more automatic diagnoses by modeling past documented defects.

Talk pydata-data-handling

Processing Open Street Map Data with Python and PostgreSQL

Travis Hathaway

Data Engineering, Databases, GIS / Geo-Analytics

Open Street Map is a large, community supported data set covering the entire world. Learn how to process this data with Python and PostgreSQL as I walk you through creating projects of your own. Along the way, we learn how OSM data is structured, and how you can use it yourself.

Talk pycon-python-language

Python 3.10: Welcome to pattern matching!

Laysa Uchoa

Best Practice, Coding / Code-Review, Python fundamentals

Python 3.10: let us learn about Pattern Matching. In this presentation, you will be surprised how simple, yet powerful, Pattern Matching really is. This talk and you, it is a match! 🔥

Talk pycon-devops

Quitting pip: How we use git submodules to manage internal dependencies that require fast iteration

Philipp Stephan

Best Practice, Development Methods, DevOps, Packaging

After a review of the current state of Python dependency management, we’d like to present a versatile method of using git submodules to handle internal dependencies in a dockerized microservice architecture, where common libraries have to be iterated quickly.

Talk general-python-pydata-friends

Rewriting your R analysis code in Python

Helena Schmidt

Best Practice, Development Methods, R

R and Python are two of the most powerful tools for any kind of data analysis. But both programming languages have their strengths and weaknesses. This leads to the question: When and how to rewrite your R analysis code in Python?

Talk pydata-visualization

Sankey Plots with Python

Daniel Ringler

Data Visualization, Jupyter, Python fundamentals

Sankey Plots in Python? Get an introduction on how and when to use them.

Talk pydata-machine-learning-stats

Secure ML: Automated Security Best Practices in Machine Learning

Alejandro Saucedo

Best Practice, Data Engineering, Security

As data science capabilities scale, the core concept of security becomes increasingly critical. In this talk we provide an overview of challenges, solutions and best practices to introduce security into the ML lifecycle.

Talk pycon-django

Securing Django Applications

Gajendra Deshpande

Best Practice, Django, Security

In this talk, we will focus on two aspects. First, performing penetration testing on Django web applications to identify vulnerabilities and scanning for OWASP Top 10 risks. Second, strategies and configuration settings for making the source code and Django applications secure.

Talk general-python-pydata-friends

Slack bots 101: An introduction into slack bot-based workflow automation

Jordi Smit

APIs, DevOps, Use Case

Most developers work with Slack every day, yet very few of them know about the awesome things you can do when you build your own Slack bot. During this talk, we will teach you to build and deploy your first Slack bot.

Talk pydata-data-handling

Squirrel - Efficient Data Loading for Large-Scale Deep Learning

Dr. Thomas Wollmann

Distributed Computing, Neural Networks / Deep Learning, Parallel Programming / Async

Learn why we built and open sourced a data infrastructure library for deep learning.

Talk pycon-programming-software-engineering

Stupid Things I've Done With Python

Mark Smith

Best Practice, Coding / Code-Review, Python fundamentals

On every computer I've had for the past 20 years, I've created a folder called "stupid python tricks". It's where I put code that should never see the light of day. Code I'm going to teach you.

Talk pycon-python-language

The Magic of Python Objects

Coen de Groot

Python fundamentals

Discover the Magic of Python Objects and the 125+ methods that keep them running

Talk general-ethics

The Myth of Neutrality: How AI is widening social divides

Stefanie Stoppel

Ethics (Privacy, Fairness,… ), Neural Networks / Deep Learning

AI is not neutral and its creation often perpetuates harmful biases. My talk highlights how difficult it is to build "fair and responsible" AI, but also why it's worth trying to prevent these algorithms from cementing existing injustices.

Talk pydata-machine-learning-stats

The secret sauce of data science management

Shir Meir Lador

Best Practice, Big Data, Career & Freelancing, Corporate

In this talk, we will discuss lessons learned on how to build a DS team that prospers while addressing the unique challenges of leading a DS team.

Talk pycon-devops

The state of DevOps for Python projects

Tobias Heintz

Data Engineering, Development Methods, DevOps

How alcemy uses DevOps techniques to streamline and accelerate our daily development. Let's look at a number of real-world examples and best practices taken straight from the pipelines we use to release code several times a day.

Talk pycon-python-language

There Are Python 2 Relics in Your Code

Miroslav Šedivý

Coding / Code-Review, Python - CPython new features, Python fundamentals

Should we return to Python 2 or should we get rid of all Python 2 relics from our code?

Talk pydata-natural-language-processing

Transformer based clustering: Identifying product clusters for E-commerce

Sebastian Wanner, Christopher Lennan

Natural Language Processing, Neural Networks / Deep Learning, Use Case

Transformer based clustering with Sentence-Transformers and Facebook Faiss for an E-commerce use case where we clustered offers to automatically generate new products.

Talk pycon-python-language

Trojan Source Malware - Can we trust open-source anymore?

Cheuk Ting Ho

Community, Governance, Python fundamentals, Security, Transparency / Interpretability

Trojan Source Malware has been tested on Python and it works. Shall the Python and open-source communities be concerned?

Talk general-python-pydata-friends

Unclear Code Hurts

Dario Cannone

Best Practice, Coding / Code-Review

Code may work or not, but it will always tell a story. Computers will not complain about how you write it (except correct syntax), but human readers will. This talk is about writing clear code and caring for the human beings that will read it. Yourself included.

Talk general-community-diversity-carreer-life-and-everything-else

Upgrade your Documentation to the Next Level

Shivam Singhal

Community, Development Methods

Learn how to write great documentation to nurture the community of your open source project.

Talk pycon-web

Web based live visualisation of sensor data

Jannis Lübbe

APIs, Data Visualization, Use Case

Streaming sensor data to multiple end devices using FastAPI and websockets.

Talk pydata-data-handling

What are data unit tests and why we need them

Theodore Meynard

Best Practice, Data Engineering

This talk will introduce the concept of data unit tests and why they are important in the workflow of data scientists when building data products.

Talk general-production

What I learned from monitoring more than 30 Machine Learning Use Cases

Lina Weichbrodt

Best Practice, Backend, DevOps

How to implement #MachineLearning #monitoring for the impatient. Lessons I learned from running more than 30 models in production. And good news, you can use your existing monitoring and dashboard stack like #Prometheus and #Grafana

Talk pydata-natural-language-processing

XAI meets Natural Language Processing

Larissa Haas

Data Visualization, Ethics (Privacy, Fairness,… ), Transparency / Interpretability

XAI meets NLP - approaches, workarounds and lessons learned while making an NLP project explainable

Talk pydata-machine-learning-stats

You shall not share!

Gönül Aycı

Ethics (Privacy, Fairness,… ), Natural Language Processing

Are you ready to have an agent that helps preserve your privacy in online social networks? "You shall not share!" will be presented by @gonul_ayci ⚡️

Talk pydata-visualization

Your data, your insights: creating personal data projects to (re-)own the data you share

Paula Gonzalez Avalos

Data Visualization, Predictive Modelling

Your data, your insights: 3 examples to illustrate how we can apply common data science libraries together with data shared via mobile apps or collected manually to build little data visualization projects that provide unique, contextual and intimate insights.