Chair Professor of Internetworking at the KTH Royal Institute of Technology. Associated with the Connected Intelligence unit of RISE (Research Institutes of Sweden), and a Wallenberg AI, Autonomous Systems and Software Program (WASP) faculty member. Main research interests are: Distributed Systems, Computer Networks, Operating Systems, and Mobile Computing.
News:
I will be recruiting postdocs and doctoral students for my Wallenberg Scholar Project titled "Sustainable and Adaptive Inferencing for Democratizing AI". There is already a job ad published for three doctoral positions in Inferencing for Large Language Models. You can find out more about the project at the KAW website.
My recent focus is on using machine learning for systems, and building systems for machine learning. One key example is our project on Scalable Federated Learning (SFL). This project aims to develop a highly scalable, flexible, extensible, distributed federated machine learning approach that can directly benefit public health and wellness. More details are available in the SFL blog. In our recent CFD paper we demonstrate our ability to perform dropout on image recognition models using coding theory (e.g., using Gold codes), resulting in more than 2X network bandwidth savings when used for Federated Learning. Our DeepGANTT IPSN '23 paper demonstrates our ability to apply transformers to graph neural networks for scaling up an IoT scheduling problem 10X beyond what a constraint optimization solver can solve in a reasonable time.
I remain committed to working on low-latency networked systems, with my biggest effort in the 2018-2024 period being my ERC Consolidator Project called ULTRA. In this single-PI, 2 million EUR project, we want to dramatically change the way Internet Services are constructed. In our SoCC 2018 paper, we present Kurma, our fast and accurate load balancer for geo-distributed storage systems. We have collaborated with Ericsson on our CacheDirector work published at EuroSys 2019. The work on Metron (NFV service chains at the true speed of the underlying hardware) led to and inspired this project. Our CoNEXT 2019 paper, RSS++, shows how load balancing with Receive Side Scaling (RSS) can be improved for large increases in efficiency. In our NSDI 2020 paper we introduced Cheetah, a new load balancer that solves the difficult challenge of remembering which server is serving which connection, without the tradeoff between uniform load balancing and efficiency. In our ASPLOS 2021 paper, we showed that vertically integrating a network protocol stack enables a single CPU core to forward packets faster than 100 Gbps. More details are available at the PacketMill project page. The Metron journal article is now also available, and it shows how high-performance NFV service chaining can be done even in the presence of blackboxes. Our NSDI 2022 paper Packet Order Matters shows a surprising result: deliberately delaying packets can improve the performance of backend servers (e.g., those used for Network Function Virtualization) by up to a factor of 2! We show three different scenarios in which our Reframer can be deployed.
In our OSDI 2020 paper on Assise, we provide a blueprint of how a next-generation distributed file system should be built on top of NVRAM that is becoming widely available, and achieve large performance gains. Our LineFS paper (Best Paper award at SOSP '21) builds upon this work by offloading CPU-intensive tasks to a SmartNIC (BlueField-1 in our case) for about 80% performance improvement across the board. Our RedN NSDI 2022 paper shows another surprising result, namely that Remote Direct Memory Access (RDMA), as implemented in widely deployed RDMA Network Interface Cards, is Turing complete! We leverage this finding to reduce the tail latency of services running on busy servers by 35x!
A major focus area for my research group was Time-Critical Clouds, a 2016-2022 project supported by SSF (the Swedish Foundation for Strategic Research) with 27 M SEK (~2.7 M EUR). This was a joint effort with the Connected Intelligence group of RISE AB. Our first major contribution is a EuroSys 2017 paper (link to video and paper available here). In this work we reduce the tail latency in key-value stores by up to 1.9x by scheduling multiget requests more efficiently. In our NSDI '18 paper we have shown how to run NFV service chains at the true speed of the underlying hardware. In our EuroSys 2019 paper we have unlocked a performance-enhancing feature that existed in Intel processors for almost a decade. In our USENIX ATC 2020 paper, we reexamine Direct Cache Access (DCA) to optimize I/O-intensive applications for multi-hundred-gigabit networks. In our PAM 2021 paper, we show that the forwarding throughput of widely deployed programmable Network Interface Cards (NICs) sharply degrades when i) the forwarding plane is updated and ii) packets match multiple forwarding tables in the NIC.
We have also concluded the work on the PROPHET ERC project (2010-2016), in which we aimed to dramatically change the way networked systems are developed and deployed. For example, we improved the performance of geo-replicated storage systems using GeoPerf [SoCC '15]. We have successfully applied software verification techniques to increase the reliability of Software-Defined Networks (SDN). Some of our key contributions to testing of OpenFlow networks are NICE [NSDI '12] and SOFT [CoNEXT '12]. We have identified serious issues in the interplay between the control and data planes in OpenFlow switches [PAM '15], and proposed an approach for verifying rule installation [CoNEXT '14] as well as fine-grained dynamic monitoring of switch dataplanes [CoNEXT '15]. Extended versions of these contributions are now available as IEEE/ACM TON and Elsevier Computer Networks journal publications. My work in the Wallenberg AI, Autonomous Systems and Software Program (WASP) as a WASP faculty member (advising an industrial PhD student at Ericsson, Amir Roozbeh) is complementary to these efforts.
We have wrapped up our work in the BEhavioral-BAsed Forwarding (BEBA) Horizon2020 project (2014-2017) that aimed to reshape Software-Defined Networks. Our contributions are described in George Katsikas' licentiate thesis, and involve deep understanding and performance optimization of Network Functions Virtualization (NFV) service chains. Moreover, our recent work on Synthesized Network Functions demonstrates high throughput with low, predictable latency on a single commodity server thanks to its highly synthesized code and request dispatching. The overall project was recently highlighted by the EU Commission.
Please see the complete list of publications below for full author lists. (Auto-generated publication list from the DiVA repository is also available)
"Profiling and Accelerating Commodity NFV Service Chains with SCC", Georgios Katsikas, Gerald Q. Maguire Jr., and Dejan Kostic, The Journal of Systems & Software, 2017.
"SNF: Synthesizing high performance NFV service chains", Georgios Katsikas, Marcel Enguehard, Maciej Kuzniar, Gerald Q. Maguire Jr., and Dejan Kostic, PeerJ Computer Science, 2016.
"Systematically Testing OpenFlow Controller Applications", Peter Peresini, Maciej Kuzniar, Marco Canini, Daniele Venzano, Dejan Kostic, and Jennifer Rexford, Computer Networks: The International Journal of Computer and Telecommunications Networking, Elsevier, 2015.
"Predicting and Preventing Inconsistencies in Deployed Distributed Systems", Maysam Yabandeh, Nikola Knezevic, Dejan Kostic, and Viktor Kuncak, ACM Transactions on Computer Systems (TOCS), Volume 28, Issue 1 (March 2010), Pages: 2-49.
"Towards a Cost-Effective Networking Testbed", Nikola Knezevic, Simon Schubert, and Dejan Kostic, SIGOPS Operating Systems Review, Volume 43, Issue 4 (January 2010), Pages: 66-71.
"High-Bandwidth Data Dissemination for Large-Scale Distributed Systems", Dejan Kostic, Alex C. Snoeren, Amin Vahdat, Ryan Braud, Charles Killian, James W. Anderson, Jeannie Albrecht, Adolfo Rodriguez, and Erik Vandekieft, ACM Transactions on Computer Systems (TOCS), Volume 26, Issue 1 (February 2008), Pages: 3-61.
I am advising several doctoral students at KTH:
Several of my students have already defended their PhDs:
Some of my students at KTH have already defended their licentiate theses (a degree half-way to the doctoral degree in Sweden):
My Full CV contains the list of Master projects that were supervised and/or examined by me:
At KTH, I teach (or have taught):
Dejan Kostic obtained his Ph.D. in Computer Science at Duke University. He spent the last two years of his studies and a brief stay as a postdoctoral scholar at the University of California, San Diego. He received his Master of Science degree in Computer Science from the University of Texas at Dallas, and his Bachelor of Science degree in Computer Engineering and Information Technology from the University of Belgrade (ETF), Serbia. From 2006 until 2012 he worked as a tenure-track Assistant Professor at the School of Computer and Communications Sciences at EPFL (Ecole Polytechnique Federale de Lausanne), Switzerland. In 2010, he received a European Research Council (ERC) Starting Investigator Award. From 2012 until June 2014, he worked at the IMDEA Networks Institute (Madrid, Spain) as a Research Associate Professor with tenure. He has been a Professor of Internetworking at KTH since April 2014. In 2017, he received a European Research Council (ERC) Consolidator Award. In 2024, he was named a Wallenberg Scholar.
d m k <at> k t h <dot> s e
Office phone# +46 8-790 42 65
Prof. Dejan Kostic
KTH Kista
Kistagangen 16
164 40 Kista
Sweden
My office is 4401 in the Electrum Building on the KTH Kista campus, East side, entering from Elevator B on the 4th floor. Approximate coordinates (on Google Maps): 59.404850, 17.949922
The best way to enter the Electrum building is from Kistagangen 16, 164 40 Kista, Sweden. Another, lower and harder-to-find, entrance is Isafjordsgatan 26, 164 40 Kista, Sweden.
Getting here from the Arlanda Stockholm airport: a convenient way of getting to KTH Kista is to catch the suburban train from the Arlanda airport (but NOT the Arlanda Express train!) to the Helenelund Train Station. You need to go to Arlanda C in Terminal 5 to board the train, and please expect to pay an airport supplement (85 SEK, I think, but prices are gradually increasing). Example Google Maps itinerary from the airport (the entrance to the Electrum building is a bit inconspicuous, through the sliding doors).
Getting here from Stockholm downtown: taking the Blue Line metro toward Akalla and getting off at Kista T-Bana (next to the Galleria shopping mall) is the best option. Then you follow the signs for Kistamassan, going up the street called Kistagangen. You will reach KTH Kista very quickly (and will not get to Kistamassan itself).
I love taking Stockholm Photos, and my larger portfolio is here: https://pixels.com/profiles/dejan-kostic?tab=artwork.
You can also follow me (dmkostic) on Instagram and Twitter. My LinkedIn profile is here.