Mailing List Message #114768
From: Cliff Lynch <>
Subject: Huge Data Workshop, Chicago, IL, April 13-14
Date: Tue, 18 Feb 2020 13:55:31 -0500
This workshop announcement will be of interest to the CNI community. Note that as of today registration has not yet opened.

Clifford Lynch
Director, CNI


          Large Scale Networking (LSN) Workshop on Huge Data:
        A Computing, Networking and Distributed Systems Perspective
                        April 13-14, 2020
             Sponsored by the National Science Foundation (NSF)
    Location: Chicago, IL, co-located with the FABRIC Community Visioning Workshop
    There is an ever-increasing demand in science and engineering, and
    arguably in all areas of research, for the creation, analysis, archival,
    and sharing of extremely large data sets - often referred to as “huge data”.
    For example, the black hole image produced by the Event Horizon
    Telescope comes from 5 petabytes of data collected over a period of 7
    days. Scientific instruments such as confocal and multiphoton
    microscopes generate images on the order of 10 GB each, and the total
    data volume grows quickly as the number of images increases. The
    Large Hadron Collider generates 2,000 petabytes of data over a typical
    12-hour run. These data sets reside at the high end of the “big data”
    spectrum and can include data sets that are continuously growing without
    bounds. They are often collected from distributed devices (e.g.,
    sensors), potentially processed on-site or at distributed clouds, and
    can be intentionally placed/duplicated in distributed sites for
    reliability, scalability and/or availability reasons. Data creation
    resulting from measurement, generation, and transformation over
    distributed locations is stressing the contemporary computing paradigm.
    Efficient processing, persistent availability and timely delivery
    (especially over wide-area) of huge data have become critically
    important to the success of scientific research.
    While distributed systems and networking research has well explored the
    fundamental challenges and solution space for a broad spectrum of
    distributed computing models operating on large data sets, the sheer
    size of the data in question today has well surpassed that assumed in
    prior research. To date, the majority of computing systems and
    applications operate on a clear delineation between data movement and
    data computation: data is moved from one or more data stores to a
    computing system and then computed “locally” on that system. This
    paradigm consumes significant storage capacity at each computing system
    to hold the transferred data and data generated by the computation, as
    well as significant time for data transfer before and after the
    computation. Looking forward, researchers have begun to discuss the
    potential benefits of a completely new computing paradigm that more
    efficiently supports “in situ” computation of extremely large data at
    unprecedented scales across distributed computing systems interconnected
    by high speed networks, with high performance data transfer functions
    more closely integrated in software (e.g., operating systems) and
    hardware infrastructure than they have been to date. Such a new paradigm has
    the potential to avoid bottlenecks for scientific discoveries and
    engineering innovations through much faster, efficient, and scalable
    computation across a globally distributed, highly interconnected and
    vast collection of data and computation infrastructure.
    This workshop intends to bring together domain scientists, network and
    systems researchers, and infrastructure providers to understand the
    challenges and requirements of “huge-data” science and engineering
    research, and to explore new paradigms for processing, storing, and
    transferring huge data. Topics of interest include, but are not
    limited to:
    ● huge data applications, requirements and challenges
    ● challenges of designing and working with devices for huge data generation
    ● storage systems for huge data
    ● software systems and network protocols for huge data
    ● in-network computing/storage for huge data
    ● software-defined networking and infrastructure for huge data
    ● infrastructure support for huge data
    ● debugging and troubleshooting of huge data infrastructure
    ● AI/ML technologies for huge data
    ● measuring huge data transfer and computation
    ● scientific workflow of huge data
    ● access to (portions of) huge data sets
    ● protecting/securing (portions of) huge data sets

Organizing Committee
    Kuang-Ching Wang, Clemson University
    James Griffioen, University of Kentucky
    Ronald Hutchins, University of Virginia
    Zongming Fei, University of Kentucky
    Acknowledgment: The workshop is supported in part by the National
    Science Foundation (NSF) under grant CNS-1747856 and by the NITRD Large
    Scale Networking (LSN) Interagency Working Group.
