Professional Experience
Projects
■ 01.2024 - now ■ Data Consultant at Capgemini Invent
○ Generative AI Projects:
Explored and implemented state-of-the-art methods.
Developed virtual assistant software for Question-Answering systems, specializing in medical queries.
Built virtual assistant software to process complex audit documents and reports, incorporating features such as question-answering, summarization, template-based proofreading, and document comparison.
○ Technical Keywords: Chatbot, Retrieval-Augmented Generation, Large Language Models, Generative AI, Unstructured, PyMuPDF, Streamlit, GCP (Document AI), Azure (AI Search, OpenAI, AI Services, etc.)
○ Fields: Airline Industry, Medical Industry, R&D
■ 04.2021 - 12.2023 ■ Data Scientist/Researcher at Quantmetry
○ R&D in Generative AI:
Developed Qolmat, an open-source library for imputing tabular and time-series data using generative models.
Implemented data augmentation techniques to improve service evaluation in the MAPIE project and contributed to quality evaluation of generated data.
Engineered generative models for synthetic designs with constraints and performed quality evaluations of generated data.
○ Technical Keywords: Data Augmentation, Generative Models, Generative Adversarial Network, Variational Auto-Encoder, Diffusion Models, Transformers, Trusted AI, Generative AI, NLP, Tabular Data, Time Series
○ Fields: Automotive Industry, R&D
■ 10.2017 - 02.2021 ■ Generative Probabilistic Alignment Models for Words and Subwords ○ NLP (Utility)
○ Researched alignment models for low-resource languages, focusing on subword tokenization and variational inference techniques.
○ Doctoral Project: 3 years at Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI)
○ Technical Keywords: Low-resource Languages, Variational Auto-Encoder, Subword Tokenization, Expectation-Maximization, Hidden Markov Models, Convolutional Neural Networks, Long Short-Term Memory Networks, SentencePiece, Giza++, Fastalign, TensorFlow, PyTorch, Python, Slurm
■ 03.2017 - 09.2017 ■ Neural Machine Translation ○ NLP (Utility)
○ Developed GAN-based Neural Machine Translation systems and explored reinforcement learning applications for improved translation quality.
○ Internship: 6 months at Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI)
○ Technical Keywords: Generative Adversarial Network, Reinforcement Learning, Convolutional Neural Network, Recurrent Neural Network, Python
○ GitHub: GAN-NMT
■ 09.2016 - 11.2016 ■ Axa Data Challenge ○ Data Challenge of DaSciM
○ Developed predictive models to address concept drift in time-series data.
○ Academic Project: 3 months at École Polytechnique (X)
○ Technical Keywords: Concept Drift, Tree Regressors, Random Forest Regressors, Gradient Boosting Regressors, Autoregressive Models, Python, Scikit-Learn, Anaconda
■ 01.2016 - 03.2016 ■ Brain MRI Segmentation ○ Software Development (Medical)
○ Developed software for segmenting brain MRIs using 3D Slicer and ITK.
○ Internship: 3 months at Kyoto Institute of Technology (KIT)
○ Technical Keywords: C++, 3D Slicer, ITK
○ YouTube: Demo
■ 06.2015 - 08.2015 ■ Project KList ○ iOS Development (Entertainment)
○ Internship: 3 months at NinePoints Co. Ltd
○ Technical Keywords: Swift, Objective-C, Xcode, Realm, Alamofire
○ App Store: Klist
■ 02.2015 - 06.2015 ■ Project 3D Movie ○ 3D Development (Entertainment)
○ Graduation Project: 4 months at Arena Multimedia Institute (AMI)
○ Technical Keywords: Maya 3D
○ YouTube: The Ant and The Grasshopper
Skills
■ Machine Learning
○ Large Language Models, Retrieval-Augmented Generation
○ Variational Auto-Encoder, Generative Adversarial Networks, Diffusion Models
○ Convolutional Neural Networks, Recurrent Neural Networks, Transformers
○ Support Vector Machines, Back-propagation, Naive Bayes Classifier, Linear Regression, K-Nearest Neighbor, Decision Trees
○ Expectation-Maximization, DBSCAN, K-Means, Hierarchical Clustering
○ Boosting Algorithms, XGBoost, LightGBM
○ PyTorch, Theano, TensorFlow, Scikit-Learn, LibSVM, Weka, MLflow, FastAPI, Slurm, AWS, Google Cloud, Azure, JupyterLab, Deepnote
○ Transformers, spaCy, SentencePiece, Spark, Giza++, Fastalign, Simalign, Eflomal
■ Software development ○ Python, Scala, R, MATLAB ○ Java (J2EE), C/C++, C# ○ OpenCV, Qt ○ Lisp, Prolog, Jess ○ NetBeans, Eclipse, Visual Studio, Anaconda, Spyder
■ Mobile development ○ Objective-C, Swift, React ○ iOS, Android ○ Xcode, Android Studio
■ Web development ○ HTML5, CSS3, JavaScript ○ PHP ○ jQuery ○ Dreamweaver ○ Adobe Flash
■ Multimedia development ○ Illustrator, CorelDraw ○ InDesign ○ Photoshop, Lightroom ○ Premiere Pro, After Effects ○ 3ds Max, Maya 3D
■ DB Management ○ SQL ○ MySQL, Microsoft SQL Server ○ Microsoft Access
■ Architectural Design ○ UML
■ Project management ○ Project planning and roadmapping, Agile methodologies, Documentation development ○ Trello, Microsoft Office Suite