NIST Data Science Symposium
On November 18-19, 2013, the US National Institute of Standards and
Technology (NIST) will host an interesting Data Science Symposium
focused on benchmarking, measurement, reference datasets, and related
issues. Many of the symposium's goals echo the ideas that have led
NIST to play such a key role in advancing work in information
retrieval through programs like TREC over the years.
Full information on the symposium is below.
Clifford Lynch
Director, CNI
------------------------
NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY
(NIST)
DATA SCIENCE SYMPOSIUM
NOVEMBER 18-19, 2013
(CO-LOCATED WITH TREC, TAC)
· Registration for the inaugural NIST Data Science Symposium is now
open!
· For those wishing to give presentations, participate as symposium
panelists, or present posters at the symposium, NIST is accepting
technical abstracts until Oct 4, 2013 (see details below).
SUMMARY:
Given the explosion of data production, storage capabilities,
communications technologies, computational power, and supporting
infrastructure, data science is now recognized as a highly critical
growth area with impact across many sectors including science,
government, finance, health care, manufacturing, advertising, retail,
and others. Since data science technologies are being leveraged
to drive crucial decision making, it is of paramount importance to be
able to measure the performance of these technologies and to correctly
interpret their output. The NIST Information Technology Laboratory is
forming a cross-cutting data science program focused on driving
advancements in data science through system benchmarking and rigorous
measurement science.
BACKGROUND:
A variety of tools and methods are emerging that process,
analyze, and derive knowledge from large amounts of complex data in
order to provide new insights that underpin key decisions. This
has spawned the creation of Big Data technologies and an emerging data
science discipline spanning new large-scale analytic tools and
methods. Several approaches have emerged that combine many component
technologies in multi-stage flows, which include machine-driven data
transformation & processing, as well as human interactions and
decision points. These approaches often lack the necessary
measures for understanding: 1) the quality and context of the analyzed
data, 2) the rigor of the analytic process and tools employed, 3) the
impact of the human in the analytic process, and 4) the strength of
the conclusions derived, questions answered, hypotheses tested, and
discoveries made that emerge from the analytic process. The NIST Data
Science program seeks to engage in benchmarking and the development of
measurement methods to help advance the performance and efficiency
(resource utilization, speed, etc.) of Big Data analytic
components, both independently and in the context of end-to-end
systems and workflows.
SYMPOSIUM DESCRIPTION:
The inaugural NIST Data Science Symposium will convene a diverse
multi-disciplinary community of stakeholders to promote the design,
development, and adoption of novel measurement science in order to
foster advances in Big Data processing, analytics, visualization,
interaction, and lifecycle management. It is set apart from
related symposia by our emphasis on advancing data science
technologies through:
· Benchmarking of complex data-intensive analytic systems and
subcomponents
· Developing general, extensible performance metrics and measurement
methods
· Creating reference datasets & challenge problems grounded in
rigorous measurement science
· Coordinating open, community-driven evaluations that focus on
domains of general interest.
Why You Should Attend:
This event will be of interest to data science researchers,
technologists, and data providers, as well as data science
stakeholders in Industry, Government and Academia. The symposium
will:
· Establish a broad multi-sector community of interest including
researchers, end-users, and solution providers focused on advancing
data science and Big Data technologies
· Contribute to the formulation of challenge problems to advance
research and tools in data science
· Facilitate availability of reusable common reference datasets
necessary to systematically compare approaches and measure
performance improvements at all levels in Big Data analytic systems
· Foster advances in data science by formulating new measurement
methods and benchmarks (e.g., accuracy, generalization, resource
usage, cost, speed, etc.)
· Foster sharing of knowledge in a collaborative community-based
forum with the goal of accelerating progress and eliminating gaps in
data science methods and tools
REGISTRATION:
· Registration to attend the NIST Data Science Symposium is now open
· Registration is free, but it is necessary to register in order to
attend
· The registration deadline is Monday, November 11, but registration
may close earlier if the venue reaches capacity. Please note that
only registered participants will be permitted to enter the NIST
campus to attend the workshop.
CALL FOR ABSTRACTS:
Participants who wish to give presentations of their technical
perspectives or present posters (potentially with technical
demonstrations) that address symposium topics should submit a brief
one-page abstract and brief one-paragraph bio to datascience@nist.gov by
October 4th, 2013. Submitters will be notified whether their
perspectives have been selected for plenary or poster presentation by
October 18th.
Speakers, panelists, and poster presenters will be selected by
the organizers based on relevance to symposium objectives and workshop
balance. Due to the technical nature of the workshop, no
marketing will be permitted.
SYMPOSIUM TOPICS:
· Measurement methodologies, benchmarking, and common reference
datasets needed to accelerate data science research and improve
performance of Big Data analytic systems.
· Primary challenges in and technical approaches to complex workflow
components of Big Data systems, including ETL, lifecycle management,
analytics, visualization & human-system interaction.
· Generation of ground truth for large datasets and performance
measurement with limited or no ground truth.
POINTS OF CONTACT:
Ashit Talukder (NIST/ITL; Chief, Information Access Division),
Craig Greenberg (NIST/ITL)
If you have questions or would like to be added to our mailing list,
please send email to datascience@nist.gov.