Philip May - Data Science and IT
I’m Philip May, a data science expert and open source enthusiast with a focus on NLP.
I come from Germany and work for Deutsche Telekom.
This website is a mixture of documentation, blog and personal notes.
Machine Learning · Python · IT · Linux
Blog · About Me
Blog
The selection of topic-specific texts from Wikipedia - August 09, 2024
Pandas Data Format and Compression - July 02, 2024
The importance of chat templates - April 11, 2024
Pandas apply - November 18, 2023
Options for Date Encoding - October 12, 2022
Python Installation and Package Management with conda and pip - July 23, 2022
Anomalies in the MLSUM Dataset - February 23, 2022
Clean German Wikipedia Text Corpus released - February 22, 2022
LightGBM with Optuna: Demo released - February 20, 2022
German colossal, cleaned Common Crawl corpus (GC4) released - April 10, 2021
Training and Evaluation of our German Electra Language Model Talk - December 01, 2020