Software Engineer / Search and Knowledge
European Patent Office / SaM Solutions GmbH & Co. KG
I maintained and developed the ANSERA search engine for patents. I developed several custom plugins for Lucene and Elasticsearch (parsers, analyzers, tokenizers, token filters, aggregators, collectors) as well as data flow tools. I took part in the development of a custom query language for patent search. I ran experiments to improve search relevance in the ERa (Enhanced Ranking) project using different evaluation metrics. I created REST micro-services. Currently I am developing a search engine for non-patent literature (scientific papers, standards, etc., and their bibliographic data) written in different natural languages, which requires language detection, normalization, and language-specific analysis. For this purpose, periodic research is carried out using the KNIME Analytics Platform.
Elasticsearch, Lucene, Java 8, Java EE, Spring, Maven, Git, Jenkins, micro-services, ANTLR, KNIME, Scrum
My main responsibility was to adapt Solr to the Russian language for use at http://zaycev.net. I configured Solr for high-traffic music search (several million queries a day) and developed several plug-ins: filters, parsers, and customized scoring functions. I developed a query log analyzer, which is used to compare the changes made in each sprint while improving search quality. I improved the search experience in autocomplete, spellchecking, and automatic correction of transliterated text or text typed using the wrong keyboard layout. I developed language models in Python and then implemented them in Java and Scala. I adapted fingerprint-based music search (à la Shazam) for the latest version of Solr.
Solr, Lucene, Java 8, Java EE, Maven, Git, Scala, Play, Python, Continuous delivery, Scrum
South Ural State University
- Natural Language Processing
- Corpus Linguistics
- Introduction to Applied Linguistics
I led a group of researchers developing Russian speech synthesis for the Nao humanoid robot.
Developed Pop-Up Dictionary - a dictionary browser for more than 80 languages and a tool for learning words.
Developed various natural language processing and search tools for a wide range of languages, including corpus annotation tools, a federated search engine for bibliographic data across different libraries, web scraping tools, dictionary builders, etc.
- Computational and Mathematical Linguistics, PhD student, ABD
South Ural State University, 2005-2009
- Theoretical and Computational Linguistics (English), specialist
South Ural State University, 1999-2004
- Information Systems in Economics, programmer
Chelyabinsk Polytechnic College, 1996-1999