C14: Learning from difficult data

Paper Submission for C14

SMC2018:C14 submission site (external site)

Abstract

Difficulties inherent in real-life data pose various challenges to contemporary machine learning algorithms. The performance of learning algorithms may be strongly impaired by adverse data characteristics such as data velocity, imbalanced distributions, a high number of classes, high-dimensional feature spaces, a small or extremely large number of learning examples, limited access to ground truth, data incompleteness, or concept drift (i.e., changes in the parameters of the probability distributions that describe the data), to name only a few.
The main aim of this session is to bring together researchers and scientists from basic computing disciplines (computer science and mathematics) and researchers from various application areas who are pioneering data analysis methods in the sciences, as well as in humanitarian fields, to discuss problems and solutions in the area of data difficulties, to identify new issues, and to shape future directions for research.

The list of possible topics includes, but is not limited to:

  • class-imbalanced learning
  • learning from data streams
  • learning in the presence of concept drift
  • learning with limited ground truth access
  • learning from high-dimensional data
  • learning on the basis of limited data sets, including one-shot learning
  • instance and prototype selection
  • data imputation methods
  • case studies and real-world applications affected by data difficulties

Session Chairs