Honghan and team illustrate how natural language processing (NLP) can help improve the efficiency of clinical coding in healthcare by assigning ICD/SNOMED codes to hospital visits. This work aims to improve what is currently a very inefficient and error-prone process in the NHS.
The paper summarises developments in both the symbolic, knowledge-based approach and the deep-learning-based approach. Given the ever-changing coding standards and guidelines, the knowledge-based approach is much needed, and its potential is yet to be fully realised.
The paper discusses seven challenges, among them: the lack of high-quality benchmarking datasets; the use of heterogeneous, incomplete and noisy sources; and learning in low-resource settings.