Abstract—NYSOL is an integrated framework for knowledge
discovery, built on a host of data processing and data mining
tools and underpinned by ongoing research activities. The
framework lets end users manage large-scale information assets
and carry out knowledge discovery on a single platform, which
improves interoperability between the two processes. Its
fundamental principle is the direct processing of text-based
data by a set of user-customizable commands for data
management, data processing, and data analysis, which greatly
simplifies the software architecture. The NYSOL framework
makes the knowledge discovery process efficient for both
novice and expert users. This paper traces the historical
development of NYSOL, from its roots in basic data processing
commands run at the command line to the recent growth of the
NYSOL software ecosystem, which adds data mining components
based on efficient machine learning algorithms. Initial
experiments on GGP, NYSOL’s large-scale information processing
architecture, with the NYSOL distributed file system (NDFS)
are also presented. The observed performance of GGP shows
reduced inter-processing overhead and improved overall
processing time.
Index Terms—Big data, data mining, distributed processing,
information processing, knowledge discovery.
Stephane Cheung and Masakazu Nakamoto were with the JST ERATO Minato
Discrete Structure Manipulation System Project, Japan. They are now with
Kwansei Gakuin University, Japan (e-mail:
stephane@erato.ist.hokudai.ac.jp, nain0606@gmail.com).
Yukinobu Hamuro is with Kwansei Gakuin University, Japan (e-mail:
hamuro@kwansei.ac.jp).
Cite: Stephane Cheung, Masakazu Nakamoto, and Yukinobu Hamuro, "NYSOL: A User-Centric Framework for Knowledge Discovery in Big Data," International Journal of Knowledge Engineering, vol. 1, no. 3, pp. 214-218, 2015.