The Center for Data Science seminar series is centered around intellectual exchange and interaction, and the audience is encouraged to ask questions during presentations. The goal is a seminar that looks less like a lecture and more like a spirited discussion of issues raised in a relatively brief presentation of a paper or a research project. Besides the following seminars, we are also collaborating with the Information Technology department on Computer Literacy Seminars.
We are looking forward to the seminars coming in 2022. If you are interested in delivering a research talk or learning more about our seminar series, please contact Juhua Hu.
Upcoming Seminars in 2022
|Title||Date of Presentation||Speaker||Affiliation||Research Focus|
|MSCSS Thesis/Capstone||Wednesday, Jun 1, 2022 |
10:10 to 15:40
|MSCSS Students||Computer Science and Systems, School of Engineering and Technology, UW Tacoma||10:10-10:40 Kevin Ewig: Enhanced Early Sepsis Onset Prediction: A Multi-Layer Approach [Supervised by Juhua Hu, Ankur Teredesai, and Katherine Stern (UW Medicine)]|
10:40-11:00 Himanshu Thakur: Early Sepsis Onset Prediction: An Interpretable CNN Approach [Supervised by Juhua Hu, Ankur Teredesai, and Katherine Stern (UW Medicine)]
13:30-13:50 Brad Luong: Backward DGA Deep Model Compression in Dense Layers [Supervised by Juhua Hu and Vadym Tymchenko (Infoblox)]
15:10-15:40 Sijin Huang: Sets of Sub-Sequences based Sepsis Prediction for ICU Trauma Patients [Supervised by Ankur Teredesai, Juhua Hu, and Katherine Stern (UW Medicine)]
Co-organized with WiDS Tacoma @ UW Tacoma, RSVP Today!
Past Seminars in 2020-2022
|Title||Date of Presentation||Speaker||Affiliation||Research Focus|
|Variation in Outcomes of Obstetric Patients in Washington State||Wednesday, May 18, 2022|
12:30 – 1:30 PM
|Jeff Lumpkin||leader of Blue Collar BI, a consulting company focused on helping organizations use data effectively||Jeff will present the work he has done with the Foundation of Healthcare Quality, which focuses on maternal and infant outcomes from 2016 – 2021. The dataset contains 169,000 records, each with 500 rows of data. It is one of the richest obstetric datasets in the country. In order to facilitate analysis of such a rich source of information, Jeff has developed some pioneering techniques in the organization and display of information and has partnered with data scientists to include hospital and physician level risk adjustments.|
If missed, here is the recording.
|Challenges, Opportunities and Pitfalls of ML Fairness ||Wednesday, May 11, 2022|
12:30 – 1:30 PM
|Muhammad Aurangzeb Ahmad||Principal Research Scientist at KenSci & Affiliate Assistant Professor of Computer Science at UW Bothell||The issue of fairness in machine learning in healthcare has become prominent in recent years. Fairness and equity related issues in AI are multifold: AI models may discover spurious causal correlations which may end up harming vulnerable communities, unfair resource allocation, under-diagnosis of minority groups, discrimination via automation, etc. This talk will focus on the various challenges that one encounters in engineering fairness in machine learning systems, especially in the context of healthcare. We will also discuss how even systems explicitly designed to be fair can lead to unfair outcomes if proper processes are not in place. The talk will be interspersed with real world examples of engineering fairness, case studies, engineering processes and how being cognizant of limitations of algorithmic fairness can help us build more equitable systems.|
Co-organized with IT as part of the Computer Literacy Seminars. If missed, here is the recording.
|AI Fairness – AIF360 Toolkit||Wednesday, May 4, 2022 |
12:30 – 1:30 PM
|Kiran Tejaswi (Teja) Alluru, President of AI Club||MSBA, Milgard Business School, UW Tacoma||The main objective of this session is to understand that while good performance and accuracy are essential, it is equally important for an AI system to be fair, transparent, and trustworthy for technological acceptance. We discuss what AI trust is, which aspects constitute it, what AI fairness is, and why fairness now plays a more pivotal role than ever in the field of AI, with real-world examples where AI has failed in the aspect of fairness. We then discuss the AIF360 toolkit and how it can be used to identify and mitigate bias in datasets, followed by a short demo of the Python codebook applying the AIF360 algorithms to mitigate bias in datasets.|
Co-organized with IT as part of the Computer Literacy Seminars. If missed, here is the recording.
|Unsupervised ML for Onset Prediction of Convection Transitions||Wednesday, April 13, 2022 |
12:30 – 1:30 PM
|Dr. Benjamin Tribelhorn||University of Portland||Dr. Tribelhorn will give a brief overview of unsupervised learning methods, feature extraction, and then detail current research in this field in the context of the Lorenz equations. This talk will explore the use of unsupervised machine learning methods to predict the onset of turbulent transitions in natural convection systems. The Lorenz system was chosen to test the machine learning methods due to the relative simplicity of the dynamic system. We generated 4 terabytes of data from solving the Lorenz equations over a large range of Rayleigh ratios, geometries, and Prandtl numbers. We extracted and evaluated features within this data. We found that the unsupervised learning methods could be automated to find outputs that align well with well-known key transition regions of the convection system.|
Co-organized with IT as part of the Computer Literacy Seminars. If missed, here is the recording.
|Computational Modeling of Chaotic Natural Convection Systems||April 6, 2022 |
12:30 – 1:30 PM
|Dr. Heather E. Dillon||Professor and Chair of Mechanical Engineering at SET, UW Tacoma||Dr. Heather Dillon will give an overview of the types of computational tools used by engineers to model complex systems. Natural convection is difficult to model because it is driven by density gradients, but it occurs in many systems we interact with each day. Examples include weather prediction, air flow in buildings, and cooling for electronics. This talk will explore the challenges in modeling natural convection computationally, and some of the ways we can use new tools like machine learning to help.|
Co-organized with IT as part of the Computer Literacy Seminars. If missed, here is the recording.
|MSCSS Thesis/Capstone||Mar. 11, 2022|
|MSCSS Students||Computer Science and Systems, School of Engineering and Technology, UW Tacoma||8:30AM – 9:00AM: Economical Multi-Clustering using Pre-trained Deep Models [Maham Rashid supervised by Dr. Juhua Hu]|
9:00AM – 9:30AM: OpERA: A Multi-Objective Optimization Approach for Edge Based Resource Allocation [Habiba Hussein Mohamed supervised by Dr. Eyhab Al-Masri]
9:30AM – 10:00AM: A Geospatial Method Detecting Map-Based Road Segment Discrepancies Across Map Providers [Jiawei Yao supervised by Dr. Eyhab Al-Masri and Dr. Mohamed Ali]
10:00AM – 10:30AM: A Real-Time Service Migration Model for Mobile Edge Computing Environments [Wenjun Yang supervised by Dr. Eyhab Al-Masri]
10:30AM – 11:00AM: Remistry: A Framework for Detecting and Mitigating Slow Subscribers, A Denial-of-Service Threat in IoT Environments [Yifeng Liu supervised by Dr. Eyhab Al-Masri]
11:00AM – 11:30AM: Enhanced Finger Movement Detection using sEMG by Data Augmentation [Larry Preuett supervised by Dr. Juhua Hu and Dr. Wei Cheng]
11:30AM – 12:00PM: Explainable Temporal Modeling of Traumatic Patient for Early Sepsis Onset Prediction as Rare Event in ICU [Tucker Stewart supervised by Dr. Juhua Hu, Dr. Ankur Teredesai, and Katherine Stern (MD, UW Medicine)]
|Use Cases and Challenges in Data Science for the United States Navy (USN)||Feb. 23, 2022|
|Daniel Lowney||Naval Undersea Warfare Center (NUWC) Division, Keyport||The United States Navy (USN), and Department of Defense (DOD) in general, face unique barriers and deficiencies in leveraging modern data science techniques including Artificial Intelligence (AI) and Machine Learning (ML) to solve today’s enterprise problems. While the USN has identified the need and a strategic plan to improve, they still face significant challenges. This presentation covers current applications of data science for specific use cases at NUWC Division Keyport including AI/ML models utilized for obsolescence management, supply support, signals analysis and classification, and tactical decision making – and the challenges limiting successful progress. An in-depth research study on condition-based maintenance is also explored including the methods, approaches, and results.|
|Understanding Maintenance History with Poorly Structured Data||Feb. 9, 2022|
|Will Gordon||Latchel||Latchel’s clients are property managers. They often have years’ worth of historical data on maintenance that has occurred at their properties. Latchel needs to analyze this data to show an ROI to their clients. In addition, Latchel should programmatically know what issues have occurred in the past and who has worked on properties before. Unfortunately, this historic data is rarely well structured or in a standard format.|
If missed, here is the recording.
|MSCSS Thesis|| Dec. 8, 2021|
|MSCSS Students||Computer Science and Systems, School of Engineering and Technology, UW Tacoma|| 10:30-11:00: Privacy-Preserving Transfer Learning for Human Activity Recognition [David Melanson supervised by Dr. Martine De Cock and Dr. Hee-Seok Kim]|
11:00-11:30: Real-Time Parking Sign Detection for Smart Street Parking [Yin Jin supervised by Dr. Juhua Hu and Dr. Wei Cheng]
12:00-12:30: Label Detection and Discrepancy Analysis across Map Providers Using Computer Vision [Adel Sabour supervised by Dr. Mohamed Ali and Dr. Eyhab Al-Masri]
12:30-01:00: Directionality Detection of Road Network Graphs across Map Providers Using Computer Vision [Abdulrahman Salama supervised by Dr. Mohamed Ali and Dr. Eyhab Al-Masri]
|Using Machine Learning to Tackle Complex Development Challenges||Nov. 17, 2021|
|Jeff Walters||Civil Engineering, School of Engineering and Technology, UW Tacoma||Increasingly, the international development sector is converging on the notion that sustainable (or unsustainable) civil infrastructure is a result of a complex interplay of interdependent technical and non-technical factors. As always, the devil is in the details, and new and improved tools and methods are needed to consider system interactions and dynamics to improve decision making. Our research team is developing an automated process to create systems models called “factor maps” from transcribed interviews with infrastructure service stakeholders. This process uses machine learning to systematically identify, map, and structurally analyze the interaction of factors that influence sustainable infrastructure systems.|
Co-organized with IT as part of the Computer Literacy Seminars. If missed, here is the recording.
|How quantum computers will affect internet security||11/18/2020 Wed, 12:20-1:20||Anderson C. A. Nascimento||Computer Science and Systems, School of Engineering and Technology, UW Tacoma||In this talk, we will show how internet security and cryptography will be affected by developments in quantum computing. We will also show recent developments in cryptography and information security aimed at protecting us from possible quantum-computing-related security threats. Co-organized with IT as part of the Computer Literacy Seminars https://www.tacoma.uw.edu/information-technology/computer-literacy-seminars|
All Other Past Seminars
|Title||Date of Presentation||Speaker||Affiliation||Research Focus|
|Cloud Computing Concepts for Everyone||4/18/2018||Wes J. Lloyd||Institute of Technology, UW Tacoma||When you mention “the cloud” many people think of the “iCloud” for music, or “Dropbox” for files, but in reality cloud computing encompasses much more. This session will provide an introduction to cloud computing concepts and terminology in a lecture plus activity based format to help convey the concepts of cloud computing to communicate what the cloud offers for everyone.|
|The Amazon Cloud||4/11/2018||Wes J. Lloyd||Institute of Technology, UW Tacoma||In the mid 2000s Amazon, based in Seattle, Washington, introduced the Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) services. These service offerings are considered by many as the services that launched the “invention of cloud computing” creating the multi-billion dollar cloud computing industry of today. This session will feature an overview of the general concepts of cloud computing from the perspective of two guest speakers from Amazon. The session will include an open question and answer (Q&A) forum with our guest speakers from Amazon.|
|Cloud Computing: How Cloud Computing is Helping Accelerate the Pace of Science||4/4/2018||Wes J. Lloyd||Institute of Technology, UW Tacoma||This session will start with an overview of the general concepts of cloud computing, and then focus on how scientists are leveraging the computational power of cloud computing to help solve problems and accelerate scientific discovery.|
|Making Mobile Health Technology Acceptable and Usable for Low-Income Safety-Net Patients||01/10/2018||Sharon S. Laing, PhD||University of Washington Tacoma Nursing and Healthcare Leadership Program||Abstract – Mobile health technology (mHealth) can reduce health disparities by improving health engagement among low-income populations. However, most mHealth tools are not specifically designed for communities largely affected by poor health outcomes.|
|Master Project Presentations – Lightning Talks||11/29/2017||Various||Institute of Technology, UW Tacoma||Master students graduating this quarter will present a seven-minute summary of their capstone projects. It is an amazing opportunity to get fast-forward presentations of the work students have been doing over a year. Quick, exciting and inspiring!|
|Master Thesis Defense||05/31/2017||UW Tacoma Institute of Technology||Master Thesis Defense|
|Master Project Presentations – Lightning Talks||05/24/2017||UW Tacoma Institute of Technology||Master Project Presentations. These presentations will be ‘Lightning Talk’ style. Poster session followed by two 15 minute student presentations.|
|Video Games to Teach Security||04/19/2017||Dr. Mario Guimaraes||Saint Martin’s University||Abstract – This presentation will describe the benefits of using Video Games in Education, How Video Games fit with multiple disciplines, the market of video games, products used to develop Video Games, how video Games fit with Security. In Security, there is less opportunity for Active Learning than other areas of IT such as Programming, System Analysis or Databases. There is not a clear final product. A Professor can’t come in and create viruses. Video games offer the possibility to create simulations that bridge the gap between theory and practice.|
|Topologically Defined Flash Memory||04/05/2017||Yanjun Ma, PhD||Intellectual Ventures||Abstract – Present flash memory uses an analogue quantity, the threshold voltage of a transistor, to store digital data by dividing the available threshold voltage range into two (for SLC), four (MLC), or eight (TLC) levels. Hardwired thresholds are usually assigned to define the boundaries between levels. This hardware defined approach has served the non-volatile memory industry well but faces problems such as reduced program/erase endurance and complexity in dealing with the increasing error correction requirement.|
|Master Project Presentations – Lightning Talks||03/08/2017||Master Project Presentations – Lightning talks – 10 minutes per student followed by a poster session. This week’s event will run from 12:30 to 2:30.|
|Thesis Defense – A Sequentialization of Features Based Approach to Complex Event Sequence Prediction||03/01/2017||Darren Hon||University of Washington Tacoma||Thesis Defense, Darren Hon – Sequence based prediction takes an ordered list of events as input and makes predictions about the next event. Most existing work on sequence based prediction assumes that the sequences are simple, i.e. consisting of symbols drawn from a small alphabet (like a DNA sequence), or consisting of numbers (like a time series). In some applications, the events are a lot more complex.|
|Challenges in modern-day natural dialogue systems||01/11/2017||America Chambers||University of Puget Sound, Mathematics and Computer Science||Abstract – Natural dialogue systems embody the goals of the Turing test: a computer system that can hold a conversation in a manner indistinguishable from a human. Building such a system is still a non-trivial open problem in the field of natural language processing. Today’s most successful systems are rule-based and work well within a controlled conversational domain. However, these systems lack the ability to reason within the broader, unconstrained environment that is typical of human conversations.|
|IBM Watson AI and the AI XPrize||11/23/2016||Stephen Rondeau||Institute of Technology, UW Tacoma||Abstract – The amount of data will quadruple by 2020, and 80% of it is unstructured data (text, images, video). How can organizations effectively search through existing data or use that data to discover new, actionable information? IBM offers Watson Analytics, which is based in part on the AI technologies used to win the Jeopardy! game show. Those AI technologies will be explored and some open-source alternatives will be presented in brief. Can they be used by a team of people to solve a grand challenge and win a $3 million AI Xprize?|
|Generating Conflict-Free Treatments for Patients with Comorbidity using ASP||11/16/2016||Elie Merhej||University of Ghent, Belgium||Abstract – Conflicts in recommended medical interventions regularly arise when multiple treatments are simultaneously needed for patients with comorbid diseases. An approach that can automatically repair such inconsistencies and generate conflict-free combined treatments is thus a valuable aid for clinicians. We propose an answer set programming based method that detects and repairs conflicts between treatments. The answer sets of the program directly correspond to proposed treatments, accounting for multiple possible solutions if they exist.|
|Cost effective design-for-testability techniques and their implications on security||11/09/2016||Samah Mohamed Ahmed Saeed||Institute of Technology, UW Tacoma||Abstract – The growing complexity of Integrated Circuits (IC) increases the test cost dramatically, while advancements in fabrication decrease the manufacturing cost, rendering the test cost the dominant factor in the overall IC cost. Maximizing the test quality necessitates large test time, and thus high test cost. The need for lowering the test cost and improving the test quality has forced the semiconductor industry to develop and adopt design-for-testability (DfT) methods. Scan-based DfT techniques are used to enhance the testability of the IC.|
|Data Center: Backbone for Internet of Things||10/26/2016||Dr. Mani Prakash||Intel Corporation||Abstract – The Internet of Things (IoT) landscape is billions of connected devices that generate megabytes to terabytes of data from simple smartphones to complex infrastructures. In order to mine this data efficiently for useful purposes, there needs to be efficient hardware and software. Whether one is looking for patterns or useful conclusions from the data or one is looking for specific information from mounds of data, the hardware and software infrastructure needs to be able to deliver in a timely fashion.|
|Autonomic Management of Cost, Performance, and Resource Uncertainty for Deployment of Applications to Infrastructure-as-a-Service (IaaS) Clouds||10/19/2016||Wes J. Lloyd||Institute of Technology, UW Tacoma||Abstract – Infrastructure-as-a-Service (IaaS) clouds abstract physical hardware to provide computing resources on demand as a software service. This abstraction leads to the simplistic view that computing resources are homogeneous, and infinite scaling potential exists to easily resolve all performance challenges. Adoption of cloud computing presents challenges forcing practitioners to balance cost and performance tradeoffs to successfully host applications in the cloud.|
|Toward a Post-Quantum PKI||10/12/2016||Paulo S. L. M. Barreto||Institute of Technology, UW Tacoma||Abstract – Standardization organizations like NIST have recently announced their interest in developing cryptographic standards that resist attacks mounted with the help of quantum computers. Among those so-called post-quantum cryptosystems, arguably the foremost contenders are schemes whose security stems from the hardness of certain computational problems on lattices, to the extent that large companies like Google have already started experimenting with lattice-based key agreement protocols in some of their products.|
|Cloud Computing in Research at UW||05/18/2016||Rob Fatland||UW IT Director of Cloud and Data Solutions||Abstract – We are now witnessing (or participating in) the leading edge of adoption of cloud technology for basic research. From geosciences to oceanography to molecular engineering to genomics to astronomy to biogeochemistry to medicine: The data deluge is starting to meet its match thanks in part to unintended consequences of — to put it briefly — the Christmas holiday. To wit: As this holiday has driven demand in eCommerce the consequent supply of compute resources has created a glut of unused computing cycles over the remaining eleven months of the year.|
|The UW Primary Care Innovation Lab (PCI-Lab): Opportunities for new technological approaches and collaborations in primary care settings||05/04/2016||Matthew Thompson||Department of Family Medicine, UW||Abstract – The gulf between primary care clinical practice and adoption of new technologies seems vast. Demands on primary care are increasing, prompted by a growing need for coordinated care, an aging population, the rise in chronic diseases, and a shift from volume-based to value-based payment. Yet adoption of new technology has been poor in most primary care settings – relatively few new technologies are implemented. With nearly 1 billion ambulatory care visits per year in the US alone, the need (and market) for new approaches is vast.|
|Simulated Strategies for Finding a Mate||04/27/2016||Chris Marriott||Institute of Technology at the University of Washington Tacoma||Abstract – Evolutionary simulations are a tool for studying biological and societal phenomena. We use evolutionary simulations to study breeding strategies in populations of sexually reproducing agents. In this talk I will present the details of our evolutionary model and share the interesting results of our experiments. In particular our agents have evolved many interesting behaviors to ensure they can easily find a mate. These include formation of herds, assortative mating, natal philopatry, and eusocial division of reproductive labor.|
|Building Pervasive Geospatial Understanding of the Transportation Ecosystem||04/20/2016||Kenn Cartier||INRIX||Abstract – The global transportation system must respond to many forcing functions such as expanding urbanization, global warming, decreasing availability of resources, and changing expectations of travelers. As these pressures continue or accelerate, geospatial technology has become increasingly critical to optimizing and controlling many aspects of this system. Already underlying many familiar systems such as in-car navigation and intelligent personal assistant systems, geospatial technology will become more pervasive and transparent in the future.|
|Modern Platform-Supported Rootkits||04/13/2016||Rodrigo Rubira Branco||Intel Corporation in the Security Center of Excellence||Abstract – Talks on modern rootkit techniques are often presented in conferences around the world, but most of them basically update existing techniques to work with new kernel improvements. This talk goes beyond and proposes a new approach: the usage of many architectural (x86-64) capabilities in order to have a resilient malware. Different aspects of the architecture are going to be explored and detailed in order to demonstrate attacker leverage against detection tools. Most of those features are widely available. Some of them are niche or fairly new enhancements.|
|Social Media Users Modeling towards Personalized Advertisement||03/09/2016||Golnoosh Farnadi||Ghent University and Katholieke Universiteit Leuven||Abstract – Nowadays web users actively generate content on different social media platforms. A large number of users requiring personalized services creates a unique opportunity for researchers to explore user modelling. To distinguish users, recognizing their attributes such as personality, age and gender is essential. To this end, substantial research has been done by utilizing user generated content to recognize user attributes by applying different machine learning techniques.|
|Autonomic Management of Cost, Performance, and Resource Uncertainty for Deployment of Applications to Infrastructure-as-a-Service (IaaS) Clouds||03/02/2016||Wes Lloyd||Colorado State University||Abstract – Infrastructure-as-a-Service (IaaS) clouds abstract physical hardware to provide computing resources on demand as a software service. This abstraction leads to the simplistic view that computing resources are homogeneous, and infinite scaling potential exists to easily resolve all performance challenges. Adoption of cloud computing presents challenges forcing practitioners to balance cost and performance tradeoffs to successfully host applications in the cloud.|
|Machine Reading for Cancer Genomics||02/24/2016||Hoifung Poon||Microsoft Research||Abstract – Advances in sequencing technology have made available a plethora of genomics data for cancer research, yet the search for disease genes and drug targets remains a formidable challenge. Biological knowledge such as pathways can play an important role in this quest by constraining the search space and boosting the signal-to-noise ratio. The majority of knowledge resides in text such as journal articles, which has been undergoing its own explosive growth, making it mandatory to develop machine reading methods for automating knowledge extraction.|
|Mining sociotechnical information from software repositories||02/17/2016||Marco Gerosa||University of São Paulo||Abstract – A large amount of data is produced during collaborative software development. The analysis of this data sets a great opportunity to better understand software engineering from the perspective of evidence-based research. Mining software repositories studies have contributed to the discovery of important information about software development and evolution, considering both technical and social aspects.|
|Privacy Preserving Machine Learning Scoring: The Case of Decision Trees||02/10/2016||Prof. Anderson C A Nascimento, Ph.D.||Center for Data Science, University of Washington, Tacoma||Abstract – Privacy preserving machine learning scoring deals with the problem of scoring data x held by Alice against a model M held by Bob so that, at the end of the protocol, Alice should obtain the desired result M(x) and Bob should learn nothing about Alice’s input. Moreover, Alice should obtain no knowledge on the model M beyond what can be efficiently computed from her input x and M(x).|
|Understanding Automatic Source Code Summarization||02/03/2016||Paul McBurney||Microsoft Research||Abstract – Programmers rely on software documentation. Software documentation tells a programmer how to use a system, and how the system functions. However, software documentation is time-consuming to write and often becomes incomplete or outdated. To address the limitations and costs of software documentation, researchers have begun producing automatic source code summarization approaches. In my proposal, I discuss my ongoing and future work towards understanding and improving automatic source code summarization.|
|Distributed Optimization on Apache Spark||01/27/2016||Naveen Ramakrishnan||Data Science Research group at Bosch Research and Technology Center||Abstract – Most machine learning algorithms involve solving a convex optimization problem. Traditional in-memory convex optimization solvers do not scale well with the increase in data. In this work, we identify a generic convex problem for most machine learning algorithms and solve it using the Alternating Direction Method of Multipliers (ADMM). We implement this framework in Apache Spark and compare it with the widely used Machine Learning LIBrary (MLLIB) in Apache Spark 1.3.|
|Janine Terrano, CEO of Topia Technology Inc.||01/20/2016||Janine Terrano||Topia Technology Inc.||Bio – Janine Terrano, is the CEO of Topia Technology Inc. which she founded in 1999 to meet the growing demand for software solutions addressing data security challenges. Terrano spent the last decade developing and piloting programs for securely moving and managing data in complex distributed environments with the US Army, FAA, Air Force and TSA. Each of these customers required the highest level of security coupled with strict performance metrics—challenges met by Terrano and Topia’s seasoned engineering team.|
|Song Li, Co-founder and CTO of NewSky Security||01/13/2016||Song Li||NewSky Security – https://newskysecurity.com||Abstract – In this talk I will cover the two major game-changing factors in the mobile and IoT age: wireless and small or no screen. Switching to wireless, we left two things behind us: the wires, and the trust infrastructure of the internet. To make things worse, devices with small or no screens make the weakest link in security – the human – even weaker. I will provide a live demo to help explain the topics.|
|Share Big Geodata at Web-Scale on AWS||12/02/2015||Mark Korver||Amazon Web Services||Abstract – Amazon Web Services has changed the economics of IT and has more than a million active customers in 190 countries, including 1,700 government agencies and 4,500 education institutions. AWS customers benefit from massive economies of scale on shared infrastructure, but they most often mention the speed and agility afforded by deploying on AWS as its most important feature. One of the services that makes this possible is Amazon Simple Storage Service (Amazon S3), our object store, and one of our first services.|
|Visual Analysis and Data Processing in Tableau||11/18/2015||Pawel Terlecki, Patrice Pelland||Tableau Software||Abstract – Data visualization has been experiencing a rapid growth in recent years. An ability to predict and embrace many directions the market is taking greatly contributed to the current strong position of Tableau. An original research project has become a powerful visual framework to perform interactive exploration, reporting and storytelling over data.|
|Deep Neural Network acoustic models for ASR||11/04/2015||Abdel-rahman Mohamed||Speech & Dialogue group at Microsoft Research||Abstract – In the past few years, Deep Neural Networks (DNNs) have achieved state-of-the-art performance in acoustic modelling on many standard benchmarks, breaking records long held by Gaussian mixture models (GMMs). In this talk, I will present our work on DNN acoustic models and on understanding why DNNs are a more sensible choice for acoustic modelling than GMMs.|
|Drug Interactions, Pharmacogenetics and the YouScript clinical decision support tool||10/28/2015||Rajeev Pany, Howard Coleman||Genelex||Abstract – Optimizing commonly prescribed drug treatments is challenging because of the prevalence of genetic variation in drug metabolizing enzymes and the frequency of multiple interactions between drugs and genes. The talk will introduce the medication interaction landscape and computational approaches to evaluating pharmacogenetic impact on patient drug regimens. We will describe some of the challenging problems we face, both in terms of modelling as well as algorithmic complexity involved in solving these problems.|
|Fan-based consumption and technology innovation: Case studies of an alternative design/innovation model from the East||10/21/2015||Huatong Sun||University of Washington Tacoma||Abstract – In the age of big data, more and more technology innovations are inspired by users’ creative consumption. Based on earlier findings from a research project sponsored by the RRF and IAS Research Fund, this talk will look at cultural consumption practices of users as fans from South Korea and China on various social media platforms and discuss how fan cultures at those sites contributed to local technology innovations for economic development.|
|Ethical Hacking||10/14/2015||Yuri Kocharov||eCommerce Foundation @ Nordstrom||Abstract – Given that security and privacy are under attack on a daily basis, the need for ethical hackers is great. To be successful at defending software and hardware systems from threats, one must first understand the mentality and motivation of “crackers”, learn their methods and then secure software utilizing defensive programming/engineering techniques. This talk will focus on an overview of threats, recent data and events, tools around penetration testing and a short near real-time demo of a penetration test.|
|Drug and genetic perturbations characterisations using network-based similarity method on the LINCS L1000 data||10/07/2015||Mushthofa||Ghent University||Abstract – The LINCS L1000 data comprises a large number of gene expression profiles of many cancer cell lines treated with drugs and genetic perturbations. In this work, we intend to derive biologically valid characterisations of the drugs and the perturbations through the use of a graph kernel similarity method and the integration of prior knowledge in the form of gene interaction networks. Graph kernel algorithms are performed on top of the network representation of the data and prior knowledge to obtain similarity scores.|
|Sarah Vluymans: Fuzzy and Fuzzy Rough Multi-instance Classifiers||08/12/2015||Abstract: Multi-instance learning is a setting in supervised learning where the data consists of bags of instances. Samples in the dataset are groups of individual instances. A decision value is assigned to the entire bag, not its constituent elements. The classification of an unseen bag involves the prediction of the decision value based on the instances it contains. Application domains include bioinformatics, text categorization and image recognition.|
|Science and Solutions for Low Resource Setting||07/22/2015||Abstract: Cardiovascular disease (CVD), or abnormal function of the heart and blood vessels, is a leading cause of mortality worldwide. In the US, heart disease accounts for 1 in every 4 deaths and is also the major cause of illness and disability, and thus has a large impact on the health care system. In India, CVD is a rapidly growing epidemic that strikes earlier in life (35–65 years), and thus has significant social and financial impact. In the absence of early corrective measures, CVD-related mortality is projected to account for 40% of all deaths in India within the next decade.|
|Challenges in Environmental Data||07/01/2015||Join us for a panel discussion of some of the challenges in acquiring, curating, and leveraging environmental data sets.
|Pavol Zajac: A connection between algebraic cryptanalysis of ciphers with low multiplicative complexity and decoding problem||06/03/2015||Pavol Zajac is an associate professor at UIM FEI STU, specialized in cryptology and IT security. He currently works on problems related to design of lightweight ciphers and algebraic cryptanalysis, and on the problems of secure implementation of post-quantum cryptography.|
|Inference of genetic networks from large scale expression data||05/27/2015||LINCS L1000 is an NIH-funded project that has measured the expression levels of 1000 genes in 1.4 million experiments for 78 different cell types in the presence of over 40,000 chemical compounds. Our goal is to infer changes in genetic interactions that result from these perturbations in order to understand the pathways affected by known drugs and identify potential candidates for new drugs. I will describe new software for inferring genetic networks that is more accurate, more scalable and 300x faster than its predecessor, and that will be applied to this data.|
|Viliam Hromada: Fault Analysis of Stream Ciphers||05/20/2015||Viliam Hromada is a post-doc at UIM FEI STU, specialized in cryptology and IT security. His main research topics include stream ciphers and theoretical fault analysis of cryptosystems. He also currently works on a side-channel analysis of an implementation of the McEliece cryptosystem on an STM32F4 micro-controller.|
|Cale Berkey, Development lead at Decisive Data||05/13/2015||Every project, every team, every day brings new possibilities to learn; to learn about our clients, to learn about the technologies we use, and to learn about ourselves. My goal every day is to advance the frontiers of learning about self-organizing agile software development teams. At Decisive Data, I lead our development teams by asking questions about how we work, how we learn, and what more we can do to delight our clients with exactly the solutions they envision.|
|Spatial Predictive Queries||05/06/2015||In this seminar, we address spatial predictive queries both in Euclidean spaces and over road networks. We provide a definition for various types of spatial predictive queries, describe current research trends, and envision future directions. We present practical application scenarios and emphasize the roadblocks that are holding industry back from the commercialization of spatial predictive queries. This seminar targets audiences in mobile data management, spatiotemporal query processing, mobile crowd sourcing, and tracking of moving objects.|
|Global Supply Chain Optimization at Amazon||04/29/2015||Adam Margulies (Principal Engineer) and Raman Iyer (Director, Software Development) talk about how Amazon uses deep research and data science to solve a variety of supply chain challenges at massive scale and complexity.|
|Yuri Kocharov: Search Technology at Nordstroms||04/22/2015||Join us for a conversation with Yuri Kocharov, Institute of Technology Alum and Software Engineering manager for Nordstrom Search, as we discuss:
|Domain Generating Algorithms||04/15/2015||BIO: Anderson Nascimento is an assistant professor with the Institute of Technology of the University of Washington – Tacoma and an adjunct professor with the department of Electrical Engineering – University of Brasilia.|
|What to Expect to Pay When You’re Expecting||04/08/2015||ABSTRACT: This talk will share the process of applying a human-centered design approach to conceptualize E$PECT — a mobile application that uses big data to estimate childbirth delivery costs.
BIO: Qiuyan Zhang and An Ping are both second year graduate students from the Human Centered Design & Engineering Department at the University of Washington. Interested in integrating user needs into product development, they worked with the Center for Data Science Team on this project over the Winter quarter as part of their Capstone project.
|Naveen Garg from NLPCore||04/01/2015||This week’s Center for Data Science Wednesday Seminar (April 1) features Naveen Garg from NLPCore giving us insight into his company and their innovative text mining platform. The talk runs from 12:00pm – 1:00pm in the UWT Tioga Building 3rd floor atrium. All are welcome. There will be coffee.|
|SEIU Healthcare NW Training Partnership and Health Benefit Trust: Data Project Overview||03/11/2015||ABSTRACT: SEIU Healthcare NW Training Partnership and Health Benefit Trust are two trusts established to train and develop professional long-term care workers to deliver high quality care and to champion innovative health care programs and coverage that improve the health of long-term care workers so they can deliver high quality care. In this talk, we will feature multiple engineering and data mining projects that are underway in the organization to support our care providers.|
|Martin Roetteler (Microsoft Research) – Quantum Computing — Power and Limitations||03/04/2015||Title: Quantum Computing — Power and Limitations
Abstract: Many research groups around the world are working on the
|Hasan Asfoor (UW CDS) Fuzzy Rough Set Approximations in Large Information Systems with Spark||02/25/2015||Abstract:|
|Dr. George Wu (Edifecs): Innovation in Health IT||02/18/2015||ABSTRACT: In order to truly impact health care delivery, innovators in health information technology need to understand some of the fundamental concepts that are driving outcomes, utilization, and cost. Besides analysis, engagement and future prediction, other things that we need to focus on are culture, experience, and emotional considerations. Think about a typical doctor’s visit when you were a child: you see a doctor, the doctor has a piece of paper and pencil, he/she tells you what’s wrong, what the treatment is, and what to do to stay healthy.|
|Student Updates (Lightning Talks)||02/11/2015|
|Next Generation Sequencing: An introduction to the technology and its applications||02/04/2015||Speaker: Roger E. Bumgarner, Associate Professor, Department of Microbiology, University of Washington Seattle.|
|Innovation in Health IT||01/28/2015||ABSTRACT: In order to truly impact health care delivery, innovators in health information technology need to understand some of the fundamental concepts that are driving outcomes, utilization, and cost. Besides analysis, engagement and future prediction, other things that we need to focus on are culture, experience, and emotional considerations. Think about a typical doctor’s visit when you were a child: you see a doctor, the doctor has a piece of paper and pencil, he/she tells you what’s wrong, what the treatment is, and what to do to stay healthy.|
|Securing Cyberspace Starting at the Local Level||01/21/2015||There are over 39,000 state, local, territorial, and tribal government agencies across the United States. These are the local police, fire departments, public utilities, public transportation and traffic management, and 9-1-1 call centers that each of us as citizens relies on every day for our well-being. These are the “first responders” in any emergency. Whether it is a physical event or a logical event does not matter: if these services suffer a compromise of their availability, integrity, or confidentiality, life can be disrupted.|
|Modeling and Simulation of Gene Regulatory Network Dynamics in Evolutionary Studies||01/14/2015||Gene regulatory networks (GRNs) are an essential part of the biological process that determines the biological characteristics of every living organism. In the context of evolutionary biology, one of the important goals of studying GRNs is to understand how rewiring of the network (due to genetic mutations) affects the biological properties of the individuals. In this talk, we will discuss how computational modeling and simulation of GRNs can be used to gain insights to answer these biological questions.|
|Student Updates II||12/03/2014||Speaker: Nitin Arya
Topic: Distributed KNN & characterization
|Student Updates||11/26/2014||Title: Named Entity Recognition
Speaker: Shruti Balabhadruni
Abstract: Named Entity Recognition (NER), i.e. the task of automatically identifying and extracting elements from predefined categories from texts, is an important problem in Natural Language Processing (NLP). In her presentation, Shruti will talk about how she tackled the challenges of NER for recognizing organisation and technology names in social media profiles.
|Understanding Collaborative Creativity in Scratch||11/19/2014||Abstract: Scratch (http://scratch.mit.edu) is a large online remixing community where millions of young people have built, shared, and collaborated on interactive animations and video games created using a mix of graphics, images, sound, and code. I will present a series of studies that use observational data from the Scratch online community to answer questions about how young people create, collaborate and interact in informal learning communities.|
|The Practice of Data Science – Ying Li Chief Data Scientist at EV Analysis Corporation||11/12/2014||ABSTRACT: Amidst the proliferating definitions of data science, there is a common understanding of the practice of data science. For data science to be a disciplined practice that truly deserves the hyped social and economic attention and, more importantly, that will scale to new potential, a set of principles should be established to guide the practice.|
|NUMS Elliptic Curves and their Implementation||11/05/2014||Abstract: Trust in the standardized and widely used NIST curves has been affected by the Snowden revelations. In response, part of the cryptographic community has devoted significant effort to looking for alternative curves that may offer stronger security assurances and also achieve higher performance.|
|Computational protein structure prediction: a success story for the application of data science to natural science||10/29/2014||ABSTRACT
Proteins naturally fold into a consistent 3-dimensional structure largely determined by their sequence of amino acids. Hence it should be possible to predict the native fold given the gene sequence. Knowing the structure of the proteins encoded by genes helps us to understand and modify the underlying genetic functions.
|Machine Learning for Information Retrieval||10/22/2014||The talk will be a shallow introduction to some uses of Machine Learning (ML) in Information Retrieval. The talk will survey learning ranking functions, building query models, and document representations.
Speaker: Niranjan Balasubramaniun
|Managing Quality and Cost across the Healthcare Spectrum||10/15/2014||Prabhu Ram, AVP at Edifecs | Viren Prasad, General Manager, Engineering at Edifecs
This presentation will summarize major trends affecting US healthcare and discuss their impacts on various stakeholders including patients, insurance companies, providers and employers. Approaches to solving some of the issues will also be discussed.
|Readmission Risk Using CT Imaging||10/08/2014||Girish Srinivasan, Clinical Solutions Leader at Samsung Electronics, will give a brief intro into Samsung’s foray into the medical business. He’ll then discuss his work on readmission risk prediction using CT imaging.|
|Models of Dynamic User Preferences||10/01/2014||Speaker: Komal Kapoor
Recommendation systems interact with people on a daily basis, serving as essential tools for navigation and decision making. Such systems have to deal with dynamic user interests and deliver relevant and engaging content over time. Changing preferences are a significant challenge for known recommendation methods. While techniques have been developed for tracking changes in preferences using time weighting and drift functions, a major drawback of these approaches is that they fail to incorporate insights from evolutionary psychology of preferences.
|Computing on Private Data and its Applications to the Health Care Industry||09/24/2014||Abstract: Public Cloud Storage Systems are an economically advantageous way for health care providers to manage health care data. However, in this scenario, there are significant privacy-related issues. A naive use of cryptographic techniques does not solve these problems. In this talk we will survey recent research results aiming at adding privacy to public cloud computing systems and their applications to the health care industry.|
|An Overview of Research Directions and Innovation Opportunities at GISTIC, the GIS Technology Innovation Center, Makkaa, Saudi Arabia||09/17/2014||ABSTRACT|
|CDS Summer 2014 Research – Conference Reports||09/10/2014||This has been a busy summer at the Center for Data Science. At this week’s seminar, we will hear conference reports from KDD 2014, KDDBHI, and VLDB 2014. The talk will run from 12:00pm – 1:00pm in the UWT Tioga Building 3rd floor atrium. All are welcome.
Speaker: David Hazel
Abstract:
|Sean Sandys – Tyemill||09/03/2014||SPEAKER: Sean Sandys, Tyemill
|CDS Summer 2014 Research – GeoSpatial Data Management Group||08/27/2014||This week’s CDS Wednesday seminar (August 27th) will feature an overview of two projects in the GeoSpatial Data Management Group. The talk will run from 12:00pm – 1:00pm in the UWT Tioga Building 3rd floor atrium. All are welcome.
PreGo: Dynamic Multi-Preference Location Inference based Routing System
Presenter: Aqeel Bin Rustum
|User Experience and Human-Centered Design||08/20/2014||Speaker: Dr. Emma Rose
Title: User Experience and Human-Centered Design
How do we know if the technologies we design are useful and usable to the people who use them? In this practical talk, Emma Rose will provide an overview of the field of user experience (UX) and the principles of human-centered design. The talk will provide opportunities to brainstorm your current design projects, so bring your ideas!
|Lightning Talks – CWDS Summer 2014 Research Threads||08/13/2014||Lightning Talks on Summer 2014 Projects to include:
– Risk of Readmission on Azure, engineering deployment
– Spark on Azure
– Inference of gene networks studying human cancers
|Simultaneous Fuzzy Rough Prototype Selection and Evolutionary Feature Selection||08/06/2014||Speaker: Nele Verbiest
Title: Simultaneous Fuzzy Rough Prototype Selection and Evolutionary Feature Selection
Faculty Sponsor: Martine De Cock
|Perfect Graphs and Their Applications in Brief||07/16/2014||Speaker: Dr. Rajat Kumar Pal
Title: Perfect Graphs and Their Applications in Brief
|Construction of Gene Networks from Expression Data||07/09/2014||During this research talk, Maciej Fronczuk will present his work with AML (Acute Myeloid Leukemia) data, involving the development of a Cytoscape plugin, and the DREAM 9 (AML Outcome Prediction) challenge, as part of his research on the construction of gene networks from expression data.|
|Big Data Architecture, In Memory DB and more||06/25/2014||Ankur will give a preview of his upcoming IEEE talk about in-memory databases.
David will give a preview of his IEEE talk – Data Visualization in Education Data Sets.
Jeremy Parks will present on his Education Assessment Tool.
Raj will provide an update on his research project.
The seminar will wrap up with a round table discussion by Ankur’s independent study students. Their topic will be Big Data Architecture.
|Graduating Student Presentations||06/11/2014||Our quarterly seminar closes with presentations by students doing research in the Center for Data Science. This multi-week program culminates with presentations from our graduating master's students.
Ji Zhang (image processing)
Kenny Kong (one last fly over)
|Student Project End of Quarter Updates II||06/04/2014||Our quarterly seminar closes with presentations by students doing research in the Center for Data Science. This multi-week program culminates with presentations from our graduating master's students.
|CDS partnership with Edifecs||05/28/2014||Edifecs is a health exchange company that develops software solutions to enable providers to reduce costs, achieve regulatory compliance and accelerate reform. In this talk, we showcase our efforts with Edifecs towards two problems – cost prediction and risk of readmission. We demonstrate our work towards building applications for cost prediction and risk of readmission. This talk is geared towards a presentation with the Edifecs team summing up the work that has been completed since Fall 2013.|
|Machine Learning Models on Azure to Power Risk of Readmission||05/21/2014|
|Tanya Berger-Wolf – Computational Behavioral Ecology||05/14/2014||Computation has fundamentally changed the way we study nature. Recent breakthroughs in data collection technology, such as GPS and other mobile sensors, high definition cameras, satellite images, genotyping, and crowdsourcing, are giving biologists access to data about wild populations, from genetic to social interactions, that are orders of magnitude richer than any previously collected.|
|Jacob Nelson Grappa: faster data-intensive applications through latency tolerance||05/07/2014||Jacob Nelson – Grappa – data framework for graph applications
In this talk, I will present Grappa, a new open-source platform for accelerating in-memory data-intensive applications on commodity clusters.
|DoubleDown Interactive||04/30/2014||DoubleDown Interactive is the social game division of International Game Technology. Based in Seattle, DoubleDown develops non-gambling, casino style games on Facebook and mobile platforms. Our flagship product is the ultra-fast growing DoubleDown Casino™ on Facebook with 6 million monthly active users. As one of the top 10 game developers on Facebook, we are committed to building a company that recognizes and rewards all employees. Come and check us out! doubledowninteractive.com
|Overview of the Recent Database and Health-Informatics Conference Visit||04/09/2014||In this talk, I will briefly summarize my recent experience in attending HealthInf, EDBT, and ICDE. HealthInf is a biomedical engineering conference with adequate health informatics focus, whereas the latter two are database conferences. I will introduce some of the interesting papers presented at those conferences that overlap significantly with the ongoing and future research interests of the center, and direct students to some valuable tutorials and keynote talks.
|Guy Van den Broeck||04/02/2014||Guy Van den Broeck is a postdoctoral researcher in the Automated Reasoning Group at the Computer Science Department, University of California, Los Angeles (UCLA).|
|Student Project End of Quarter Updates II||03/19/2014||Students engaged with CWDS will present progress on their research. Students will give a summary of their research problem and motivation. Short demos of their end-product will be given, and progress, successes, and issues will be discussed.|
|Student Project End of Quarter Updates I||03/12/2014||Students engaged with CWDS will present progress on their research. Students will give a summary of their research problem and motivation. Short demos of their end-product will be given, and progress, successes, and issues will be discussed.|
|Student Project Midterm Update II||02/12/2014||Students engaged with CWDS will present progress on their research. Students will give a summary of their research problem and motivation. Short demos of their end-product will be given, and progress, successes, and issues will be discussed.|
|Student Project Midterm Update I||02/05/2014|
|Post-doc Research: Abdeltawab Hendawi||01/29/2014||Title: Predictive Query Processing On Moving Objects
Abstract: A fundamental category of location-based services relies on predictive queries, which consider the anticipated future locations of users.
|Post-Doc Researcher: Golnoosh Farnadi||01/22/2014||Abstract
Social networking sites are becoming an important source of users’ interaction. Their users generate lots of information about themselves and post it on these sites; however, as with any real-world data set, the data are mostly incomplete, vague, and uncertain. Understanding and modeling of social network sites (e.g., Facebook) in order to predict missing attribute values in profiles of users would help improve user modeling, recommendations, and personalization, among
|New Quarter, New Location||01/19/2014||The CWDS has moved into the UWT Research Commons in the Tioga Library Building. Along with our collaborative space shared with other Centers of UWT, we also have a new seminar presentation space that provides us with a state-of-the-art Video Wall, lots of natural light, and a configurable space for presentations, small group meetings or poster presentations. Please come and see the new space and learn how you may better interact with the Center in our new location.|
|CWDS Partnership with Edifecs||01/15/2014||U.S. healthcare expenditures continue to skyrocket, and more than $700 billion is wasted every year. That’s the problem Edifecs seeks to solve.|
|CWDS Weekly Seminar||11/06/2013||Naren Meadem and Deepthi Sistla will present updates on work they have been doing in the Center.
Naren has been configuring clusters in Hadoop and Mahout to help calculate waiting times in Urgent Care Centers.
|Joel Larson, Ankur Teredesai, David Hazel – Conference Highlights||10/16/2013||Joel Larson:
Joel attended SIGITE/RIIT 2013, a forum for sharing and developing ideas relating to Information Technology research, education, applications, IT-industry-academia relationships and our roles as professionals, educators, and advocates for the effective use of Information Technology.
|ChronoZoom: Travel Through Time for Education, Exploration, and Information Technology Research||10/09/2013||We describe the architecture, infrastructure requirements, and technical evolution of ChronoZoom, a unique infinite-zoom, temporal-data-visualization open-source platform. With ChronoZoom, it is possible to browse through time and history and fill the browser with events that span from 13.8 billion years to a single day.|