Prof. Dr. Alexander M. Fraser - Activities
- Frauke Kreuter and I have a new interdisciplinary project with the Bundesbank (German Central Bank) on using NLP for analyzing corpora of sustainability reports, with a focus on green indicators. We are hiring for a start very soon, see here.
- Some great work being presented at EMNLP 2023 and the Workshop on Multilingual Representation Learning. Viktor and others have a paper on using anchors and chains to learn embeddings for low-resource languages. Lavine and Alexandra have a paper on data imbalance and representation degeneration in multilingual MT. Marion and Kathy have a paper on accessing linguistic information in LLMs. Congrats!
- I've joined the ELLIS Munich Unit, see here. ELLIS is the European Laboratory for Learning and Intelligent Systems.
- I am a senior area chair (for Machine Translation) at NAACL 2024 in Mexico City.
- Kathy has two papers in Findings of ACL 2023 which she will present in Toronto. The first one is on moral reasoning across languages in multilingual language models (with Jindrich and Darmstadt people in Kristian Kersting's lab), and the second one is on outlier dimensions in multilingual language models (with Alina and Jindrich). Lihong Liu, Alexandra and Hinrich have a paper at IWSLT on the copying problem in unsupervised NMT. Congratulations! See the publications page.
- I am a senior area chair (for Machine Translation) at EMNLP 2023 in Singapore.
- Sophie has a survey paper on unbalanced distributions at EACL 2023 with William Beluch and Annemarie Friedrich, while Alexandra has a Findings of EACL paper on domain adaptation with Matt Peters and Jesse Dodge, and a paper at the Workshop on Technologies for Machine Translation of Low-Resource Languages (at EACL), with Dario. Congratulations and have fun presenting!
- Viktor Hangya's DFG proposal on transfer learning for hate speech detection was accepted, congratulations! Viktor is hiring.
- I received an ERC Proof of Concept grant "Data for Multilingual Learning (DATA4ML)"! This will create a real-world demonstrator for the work we did in my ERC Starting Grant on Domain Adaptation. The main focus here is on creating a tool to help minority language activists find and create multilingual data for training machine translation systems and language models for their languages, but our work will enable the creation of domain-specific models for companies as well. Here is the press release in English and German.
- I was interviewed by the Handelsblatt newspaper about Facebook's open source AI strategy, which I strongly support. We have used Facebook models in many of our research projects, such as studies on cultural moral norms in LLMs and on transfer learning for cross-lingual hate speech detection. Excerpts from the interview (in German).
- The journal version of our paper on cross-lingual transfer learning for hate speech detection has been published, congratulations Irina, Viktor and Iryna!
- My research group is offering MA and BA topics in WS 2022/2023, please read this page carefully
- Lavine and Alexandra have a paper in Findings of EMNLP on using adapters for multilingual multi-domain NMT. Viktor and Saadi have a paper at EMNLP on improving low resource languages in pretrained multilingual language models. Saadi, Viktor and Tobias Eder have a paper at the Multilingual Representation Workshop at EMNLP on comparison of cross-lingual contextual embeddings. Sophie also has a paper on some previous work on modals in Findings of EMNLP. Nice work everyone!
- I'll be one of the two senior area chairs for Multilinguality (with Ivan Vulic) at EACL 2023 (Dubrovnik/hybrid, this was only recently announced because it was previously planned for Kyiv). The deadline is soon, please submit your excellent research.
- Congratulations to Lavine and Jindrich for a nice paper at COLING 2022 on domain robustness and domain adaptability.
- Sotra, the Sorbian translator, now handles Lower Sorbian as well! Dario and I were both involved in the modeling work. Here is a news video and a short press release (in Lower Sorbian and German)!
- The ERC published a short article about my ERC Starting Grant, appropriately enough in 6 languages. See here.
- Demo at VLDB with Carsten Binnig's group on conversational agents. See Gassen et al, here.
- Keynote at BUCC 2022; attending LREC in Marseille was fun. Here are the slides, the paper on Pseudo Parallel Corpora for OOVs (Huck et al 2019), and the paper on Translating Unseen and Rare Senses (Hangya et al 2021).
- I gave an invited talk at the workshop AI in Science: Foundations and Applications on June 9-10, 2022 at the MCMP at LMU Munich. The topic was "Some Open Problems in (Multilingual) Natural Language Processing". Here are the slides.
- Looking forward to hearing the official word soon about the Munich Center for Machine Learning (MCML) phase 2, in which I am a PI. This is an important focus of a wide range of efforts in machine learning in the Munich area. It is a joint effort by LMU and TUM.
- Jindrich's paper on neural string edit distance will appear at the ACL workshop on structure prediction! See publications.
- New technical report with Kristian Kersting's group on moral norms and multilingual models. Click here for the report.
- Katharina and Jindrich both have papers in Findings at ACL, congratulations! Katharina's (with Jindrich) is on combining the strengths of static and contextualized models, while Jindrich's (with long-time collaborator Helmut Schmid) explains why character-based MT models are not being used. The works will also be presented as posters at ACL. See publications.
- Report on arXiv by Marion and Matthias, on comparing our two approaches to dealing with target-side morphology in NMT. The two approaches are based on morphological analysis and on stemming. Experiments are on German and Czech. We hope this research will influence future work on translating into other morphologically rich languages. Here is the report.
- "Understanding Sorbian" on 21 Feb 2022 (International Mother Language Day!): writing about our first (unsupervised) Upper Sorbian/German system and the resonance of the shared tasks on Upper Sorbian and Lower Sorbian that we have organized at ACL-WMT. LMU Newsroom article in German, in English.
- New technical report on cross-lingual detection of hate speech with Irina, Viktor and Iryna Gurevych, click here.
- Lisa Woller and Viktor had a paper on Occitan embeddings at the 2021 Multilingual Representation Learning workshop, see the paper here, nice work!
- WMT 2021: Viktor has a paper on multi-sense words as part of our Cambridge collaboration (together with Flora Liu, Dario, Anna Korhonen), Jindrich organized a shared task and has a system paper, and Lavine has a system paper (with Jindrich), see publications, congrats!
- Irina, Alexandra, Tobias and Simon have first author papers at LT-EDI, NAACL, ACL, and MT Summit, see publications, congratulations to them and to their coauthors!
- Dr. rer. nat. Dario Stojanovski successfully defended his PhD thesis on "Modeling Contextual Information in Neural Machine Translation"! Thank you Philipp Koehn (JHU) and Rico Sennrich (Zurich) for virtually travelling to Munich to serve on the committee.
- We are organizing the WMT 2021 shared task on Unsupervised MT and Very Low Resource Supervised MT. The tasks are to translate Upper Sorbian and Lower Sorbian to/from German, as well as to translate Chuvash to/from Russian. Please participate! More information here.
- Dario presented a key paper for his PhD research showing that document-level context allows domain adaptation to zero-resource domains (domains which did not appear in the training data).
- The SOTRA translator for translation of Upper Sorbian to and from German went live! Here is a link to the press conference (in German with some Upper Sorbian), where I briefly talked about planned future work on the system.
- I was invited to participate in the first meeting of the ELLIS Society Natural Language Processing initiative for two days in February 2021, it was very interesting!
- NEW: all topics are taken, thanks for your applications. [My research group is offering MA and BA topics in SS 2021, please read this page carefully.]
- New papers at EMNLP, COLING and WMT! See publications.
- Viktor is presenting new work on dealing with many-to-one and one-to-many translations (i.e., compositionality) at AMTA 2020. See publications.
- NEW: all topics are taken, thanks for your applications. [My research group is offering MA and BA topics in WS 2020/2021, please read this page carefully]
- I held a talk on recent research at the Faculty of Mathematics, Informatics and Statistics (here at LMU Munich), see here for a video.
- Marion Weller-Di Marco's paper on Modeling Word Formation in English-German Neural Machine Translation appeared at ACL. See publications.
- Silvia Severini and Viktor Hangya (in joint work with Hinrich and myself) had the best results in the Russian-English shared task at BUCC 2020. See publications.
- Congratulations to Leah Michel and Viktor Hangya on a study of the challenges of using standard post-hoc-mapped Bilingual Word Embeddings for the low-resource language Hiligaynon at LREC 2020. See publications.
- We are organizing the WMT 2020 shared task on Unsupervised Machine Translation and Very Low Resource Machine Translation. The tasks are to translate from German to Upper Sorbian, and from Upper Sorbian to German. Please participate! More information here.
- My research group is offering MA and BA topics in SS 2020, please read this page carefully.
- Veronika Hintzen presented on email analytics and Costanza Conforti presented as a GSCL finalist at Konvens 2019 in Erlangen, see publications.
- Congratulations to Jindřich Libovický for winning the Dimitris Chorafas prize for the best hard science PhD thesis in 2019 at Charles University Prague.
- Congratulations to Costanza Conforti for being a finalist for the GSCL award for her work with Matthias and me on Rich POS tagging on a lemma representation.
- Welcome to Alexandra Chronopoulou, Jindřich Libovický, and Marion Di Marco (Weller-Di Marco)!
- I am on a panel on the future of machine translation at the ACL Conference on Machine Translation 2019 (WMT), with moderator Ondrej Bojar and fellow panelists Alon Lavie, Marcin Junczys-Dowmunt, and Yvette Graham. Should be fun.
- Lots of new research papers to be presented in summer 2019, including curriculum training, document MT, unsupervised MT, unsupervised parallel sentence extraction, bilingual lexicon induction for MT OOVs, multilingual POS tagging, see publications. Thanks team!
- My research group is offering MA and BA topics in WS 2019/2020, please read this page carefully.
- We are co-organizing the ACL 2019 Conference on Machine Translation Unsupervised Shared Task, please participate!
- We are offering MA and BA topics in SS 2019, please read this page carefully (see at the bottom of the page, it is only accessible to CIS students)
- Post-doc position available, click here.
- PhD position available, German required, click here.
- New research paper at the International Workshop on Spoken Language Translation (IWSLT) from Viktor, Fabienne and Yuliya on unsupervised extraction of parallel sentences from comparable corpora. See publications.
- I gave a keynote on translation to morphologically rich languages at Baltic Human Language Technologies 2018 in Tartu at the end of September.
- New research paper at the conference on machine translation (WMT) from Dario on using oracles to explore context NMT. Dario, Matthias and Viktor also have three new WMT shared task papers (which are unsupervised, bio/news and filtering). See publications.
- Congratulations to Dr. Matthias Huck for the successful defense of his PhD thesis at RWTH Aachen!
- One post-doc and two PhD student positions are available, click here.
- Two new papers at ACL, one on domain adaptation of bilingual tasks by Viktor and Fabienne, and one from Philipp Dufter and others on massively multilingual embeddings. See publications.
- Congratulations to Dr. Anita Ramm for the successful defense of her PhD thesis!
- Two new papers, one from Costanza and Matthias on neural POS-tagging of lemma sequences at AMTA 2018, and one from Fabienne, Viktor, and Tobias on bilingual lexicon induction of rare words at NAACL 2018. See publications.
- The Health in My Language (HimL) project (funded by EU Horizon2020) is coming to a close. It was a great experience working with our user partners Cochrane and NHS24, along with Edinburgh, Lingea and Prague. This blog post looks at Cochrane's use of post-editing of our systems' output.
- If you need to explain machine translation to a German-speaking 8-year-old, check out my interview on the German children's TV show "Erde an Zukunft", which was a lot of fun. Pick "Eine Sprache für alle", here (it is no longer on page 1, go to page 2).
- Congratulations to Marion Weller-Di Marco for the successful defense of her Ph.D. thesis!
- New job: As of September 1st, I am Professor of Information and Language Processing (CIS, LMU Munich).
- Congratulations to Matthias Huck and Fabienne Braune for their neural machine translation system for English to German translation, which had the best results according to human evaluation at WMT! See the system paper in publications for more details.
- New Papers: Matthias Huck and Simon Riess, and Aleš Tamchyna and Marion Weller-Di Marco, have two papers about dealing with rich morphology on the target-side in neural machine translation (looking at linguistic segmentation and tag-lemma representation respectively), which were accepted to WMT. Valentin Deyringer has a paper on training with Hogwild! at MTM and Leonie Weißweiler has a paper on German stemming at GSCL. See publications for more details.
- The proceedings of the research track at EAMT (European Association for Machine Translation) 2017, which is in Prague, are out. There are some great research posters, here are the boaster slides (2 slides per poster).
- Congratulations to Anita Ramm and Costanza Conforti for accepted papers (two papers on verbal morphology and preordering and a paper on Venetan respectively)! See publications for more details.
- I'll be the program chair (research track) at EAMT (European Association for Machine Translation) 2017, which will be in Prague.
- Congratulations to Marion Weller-Di Marco, and to Matthias Huck and Aleš Tamchyna, for papers about translation to morphologically rich languages, which were accepted to EACL! See publications for more details.
- One post-doc and one PhD student position are available, click here (deadline has now passed)
- I was interviewed in the Süddeutsche Zeitung (newspaper), see here.
- Congratulations to Fabienne Braune for winning the EAMT 2015 Best Thesis Award, which she collected in Riga! For more information, click here.
- I gave a keynote on translation to morphologically rich languages at the European Association for Machine Translation (EAMT 2016) in Riga. Here are the slides.
- Munich hosted the May 2016 meeting of the Health in my Language (HimL) project.
- Thanks to Prof. Francois Yvon of LIMSI, who was a Visiting Fellow at the LMU Center of Advanced Studies and held a very interesting talk on cross-lingual transfer, as well as participating in much of the HimL meeting (and giving a talk there as well on his collaboration with HimL partner Cochrane! See his LREC paper).
- I was profiled in University of Munich's "Einsichten" Magazine (German), click here.
- Welcome to Matthias Huck who just joined CIS to work in both HimL and the ERC project!
- Welcome to Dr. Liane Guillou who just joined CIS to work on the Health in my Language project!
- We found students for all seven Bachelors and Masters topics we offered for March starting dates (see below), thanks for your applications
- Welcome to Dr. Tsuyoshi Okita who just joined CIS to work in the ERC project!
- Congratulations to Dr. Fabienne Braune on a successful PhD defense! Her topic: Decoding Strategies for Syntax-based Statistical Machine Translation. Her next step: transition to post-doc work on both the ERC project and HimL.
- Congratulations to Dr. Fabienne Cap for receiving the VINNMER Marie Curie Incoming Stipendium in the context of the VINNOVA "Mobility for Growth" Programmes. We are looking forward to collaborating with Fabienne C. on her new research on multi-word-entities.
- I am hiring post-docs in both the ERC Starting Grant and in the HimL project, see here for SMT and NLP backgrounds and here for deep learning / general machine learning backgrounds.
- I will be an area chair for Machine Translation and Multilinguality at EMNLP 2016, which will be in Austin.
- I will be an area chair at NAACL 2016, which will be in San Diego.
- Fabienne Braune presented our paper on discriminative rule selection for string-to-tree translation at EMNLP.
- Our paper on combining lemmatization and morphologically rich POS tagging into a joint model at EMNLP received an honorable mention for the best short paper award.
- Martin Wunderlich went to GSCL in Essen and presented our paper on a proof-of-concept system for WSD of Old English.
- Ryan Cotterell presented our paper on labeled morphological segmentation at CoNLL.
- Marion Weller presented our paper on generating prepositions at EAMT in Antalya. She will also present the work at SSST in Denver, here is the extended abstract accepted there.
- I've been awarded a European Research Council Starting Grant! The press release is available from the university web page: in German, English. The agreement has now been signed by the EC. This project will have a large team with a wide breadth, requiring research expertise ranging from morphological modeling for statistical machine translation, through several forms of domain adaptation, to user interfaces. Stay tuned!
- Our Horizon2020 Innovation Action Health in my Language (HimL) (pronounced like Himmel) was funded (together with U. Edinburgh, Charles U. Prague, Cochrane, Lingea, NHS 24). We kicked off with a meeting in Edinburgh in Feb 2015. This involves an exciting application for the linguistically-aware statistical machine translation systems we've been working on, applied in the translation pipelines of two innovative non-profit health organizations.
- Congratulations to Fabienne Cap on defending her PhD! Her topic: Morphological Processing of Compounds for Statistical Machine Translation.
- I taught a two-week course on Information Extraction at the DAAD International Summer School for Advanced Language Engineering (ISSALE) at the University of Colombo School of Computing, Sri Lanka. Thanks to a great class of people from Nepal, Pakistan and Sri Lanka, and to all of the organizers!
- The German Society for Computational Linguistics has selected Nadir Durrani's dissertation as the best language technology / computational linguistics dissertation for the years 2011 to 2013. The thesis presents the operation sequence model, a new statistical machine translation model that integrates translation and reordering operations into one unified sequence model. Congratulations to Nadir for winning this prestigious award!
- Ryan Cotterell of Johns Hopkins University, a PhD student of Jason Eisner and David Yarowsky, has been awarded a Fulbright student scholarship to spend a year at CIS and conduct research on computational morphology under the supervision of Alex Fraser and Helmut Schmid.
- Marion Weller presented a paper on using noun word classes to generalize preposition selection at AMTA in Vancouver.
- Nadir Durrani presented a paper on generalizing the representation of words in both phrase-based models and operation sequence models at COLING in Dublin.
- A group of Stuttgart and Munich people wrote a paper on examining the effect of compositionality of compounds in translation. This was presented at the First Workshop on Computational Approaches to Compound Analysis (ComaComa) which is part of COLING in Dublin.
- Marion Weller presented our paper on using automatically mined bilingual terminology for domain adaptation at EAMT in Dubrovnik.
- We participated in the shared task at WMT 2014 for translating English to German and our system did well, see the paper, which includes a comparison of several of our linguistically-motivated sub-systems.
- Fabienne Cap presented our paper on producing German compounds in English to German Translation at EACL in Göteborg.
- We integrated Vowpal Wabbit into the open source Moses statistical machine translation system, and the paper is in the Prague Bulletin of Mathematical Linguistics. This is one of the outcomes of the Johns Hopkins Workshop on Domain Adaptation.
- I was a visiting professor at the Universität Heidelberg in Summer Semester 2014, representing the professorship in Linguistic Computer Science (Linguistische Informatik). I am back in Munich for Winter Semester 2014/2015.
- Our Dagstuhl Seminar on Statistical Techniques for Translating to Morphologically Rich Languages was very interesting. It was organized in part by the Morphosyntax for Statistical Machine Translation project. The organizers were Alexander Fraser, Kevin Knight, Philipp Koehn, Helmut Schmid, and Hans Uszkoreit. The seminar took place in the first week of February 2014.
- We did well at the 2013 shared task on machine translation, see our three shared task papers
- The paper Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT? was presented by Nadir Durrani at ACL 2013
- The paper Using Subcategorization Knowledge to Improve Case Prediction for Translation to German was presented by Marion Weller at ACL 2013
- The paper Model With Minimal Translation Units, But Decode With Phrases was presented by Nadir Durrani at NAACL 2013
- The second phase of the Morphosyntax for SMT project was approved by the German Research Foundation (DFG)!
- I gave a keynote talk at TALN 2013 on translating to morphologically richer languages
- I have moved to the CIS at the University of Munich (LMU München), but am still often in Stuttgart at the IMS.
- Our Dagstuhl Seminar on Statistical Techniques for Translating to Morphologically Rich Languages has been accepted, which is great news for the Morphosyntax for SMT project. The organizers are Alexander Fraser, Kevin Knight, Philipp Koehn, Helmut Schmid, and Hans Uszkoreit. The seminar will take place in February 2014. Sixty top researchers will be invited to have free-ranging discussions of how to move forward on this important problem. (For more information on Dagstuhl Seminars in general, click here)
- Nadir Durrani passed his defense, congratulations! His topic: A Joint Translation Model With Integrated Reordering. He has accepted a position with Philipp Koehn at University of Edinburgh.
- I will serve as a tutorials chair at ACL 2014 in Baltimore, looking forward to your applications.
- Congratulations to Hassan Sajjad for passing his defense! His topic: Statistical Models for Unsupervised, Semi-supervised and Supervised Transliteration Mining. He has accepted a position with Stephan Vogel and Preslav Nakov at the Qatar Computing Research Institute.
- I was an organizer of the Johns Hopkins Summer Workshop on Domain Adaptation for Machine Translation. This involved 6 weeks of intensive work on taking advantage of out-of-domain data in translation, conducted by 13 researchers on location at the Center for Language and Speech Processing at the Johns Hopkins University in Baltimore.
- Paper accepted to Computational Linguistics Journal: Knowledge Sources for Constituent Parsing of German, a Morphologically Rich and Less-Configurational Language, by Alexander Fraser, Helmut Schmid, Richard Farkas, Renjing Wang, Hinrich Schuetze (abstract, preprint)
- I taught Statistical Machine Translation at the Summer School in Advanced Language Engineering at Kathmandu University in Nepal in September 2012. This was supported by a DAAD grant. We had participants from Nepal, Pakistan and Sri Lanka (participants from Bangladesh and Bhutan were also invited, but were unable to attend).