 
Application of artificial intelligence in the diagnosis of haematological disorders
- Biomedical Science Department, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Kepala Batas 13200, Malaysia
- Department of Medical Laboratory Sciences, Faculty of Allied Medical Sciences, Mutah University, al Karak, Jordan
- Biomedical Science Department, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Kepala Batas 13200, Malaysia
Abstract
There is a broad spectrum of hematological diseases, and their origins can be attributed to a variety of factors, including genetic abnormalities such as leukemia, sickle cell anemia (SCA), and thalassemia, as well as conditions associated with the lack of certain blood components, such as iron deficiency anemia (IDA). Testing and analyzing for hematological disorders is intensive in terms of time, effort, and labor. Additionally, there is a higher chance of human error and variance during the manual examination and analysis of the test samples, depending on the expertise, skills, and experience of the examiner. Recent developments in artificial intelligence (AI), such as Machine Learning (ML) and Deep Learning (DL) algorithms—such as Random Forest (RF), Decision Tree (DT), and Support Vector Machine (SVM)—have demonstrated the considerable contribution they could make to more rapid and accurate disease diagnosis, detection, and classification. An increasing number of hematological diseases are being diagnosed using AI techniques, which combine tabular and image data to eliminate human error, generate more precise results, and decrease the time required for diagnosis. This review discusses several widely utilized AI disease evaluation algorithms and their applicability to hematological disorders. Additionally, we highlight key challenges such as the lack of accessible clinical data, which inhibits the implementation of AI in the field of medicine.
Introduction
An accurate early diagnosis is a crucial factor that plays a vital role in determining the treatment and prognosis for any disease, particularly in hematological disorders. In addition to the complete blood count (CBC), bone marrow (BM) and peripheral blood smears (PBS) examined under a microscope are two of the main tests used to diagnose a hematological disorder. However, the results and analysis of these diagnostic tests can vary based on the expertise and interpretation of the examiner. Hence, there is a high probability of significant human error, which could prove detrimental to a patient’s treatment plan and prognosis1.
Moreover, the time-intensive and labor-intensive nature of these tests adds to the testing costs, making them less accessible to a large portion of the population. For example, iron deficiency anemia (IDA) and thalassemia exhibit similar symptoms; therefore, when a misdiagnosis occurs, a thalassemic patient could be prescribed unnecessary iron supplements while the thalassemia could go untreated2. In the case of leukemia, a delayed diagnosis may result in rapid disease progression and possible death of the patient. Therefore, a correct diagnosis made at an early stage of the development of a hematological disorder is crucial as it enables medical professionals to develop suitable treatment plans. Automated approaches, such as artificial intelligence (AI), have been proposed as more accurate, time-efficient, and less labor-intensive means for disease detection1, 2, 3.
The early foundations of AI were laid in the 1970s with the introduction of rule-based systems like MYCIN, which were utilized for diagnosing bacterial infections and prescribing appropriate antibiotics4. Since the inception of AI in healthcare, it has advanced significantly and shown high success rates when utilized in numerous intelligent applications to solve several data-related issues5. A growing number of AI applications, and its subcategory Machine Learning (ML), are being developed in the field of medicine to assist clinicians in analyzing test results for rapid disease diagnosis and to devise better therapeutic interventions6, 7. This development was catalyzed by the abundance of clinical data combined with powerful AI tools, which have allowed for a wider range of applications in this area8.
The purpose of this review is to highlight the development of AI technology and its application in hematological diseases, as compared to other diseases like brain tumors where the field has been explored in detail. Additionally, with the recent exponential advancements in this technology, there are opportunities to implement a myriad of AI technologies. For example, analyzing blood cell morphology can be quite complex and variable, but it is becoming consistently more accurate and faster thanks to numerous AI algorithms and the availability of big data sets. Therefore, in this review, we discuss various models and examples of AI that are being applied in the field of medicine, particularly in hematological disorders. Additionally, the challenges faced in implementing AI in disease detection and classification are also discussed in this study.
Methods
This review screened three databases (PubMed, Google Scholar, and ScienceDirect) for literature relevant to the search terms. Studies that utilized AI in hematology and hematological cancers were identified using keywords such as artificial intelligence, clinical translation, deep learning, hematological disorders, machine learning, diagnostics, healthcare technology, leukemia, anemia, and other terms related to artificial intelligence and its applications in the medical domain.

Subcategories of artificial intelligence.
Machine Learning (ML)
ML is a part of AI that analyzes data samples to create models using mathematical and statistical approaches, allowing machines to learn without being explicitly programmed (Figure 1). A computer program is said to have learned from experience if its quantifiable performance on a set of tasks improves as it gains experience with them9. ML was first used in a checkers game by Arthur Samuel in 1959, who utilized annotated moves from experienced players10. Since then, ML has been validated and applied to a wide range of different applications1, 10, 11, 12. Currently, with the rapid increase in computational capabilities and data availability, training data-driven ML models has become more feasible, resulting in more time- and cost-effective ML applications13.
ML is classified mainly into two types based on the type of input: supervised and unsupervised. The supervised learning ML algorithm makes use of “labeled” input and output training datasets to learn from in order to classify unlabeled datasets and predict the output accurately once the supervised ML algorithm is refined14. On the other hand, unsupervised ML examines unlabeled datasets and reveals unknown patterns using various clustering and/or association algorithms14. Supervised ML is grouped into two subtypes based on the type of output: classification and regression15. Classification is used to classify datasets into specific segments based on chosen parameters, while regression uses statistical methods to find a correlation between dependent and independent variables, which helps to make the cause-and-effect prediction. The following sections provide a bird’s eye view of the frequently used supervised ML algorithms utilized in the medical field due to the homogeneous and consistent nature of clinical tests along with the availability of sufficient data to form training datasets.
Decision Tree (DT)
The modern DT algorithm was developed in 1986 by John Quinlan, who developed features from a family and its members addressing the same task16. According to Suthaharan (2016), this supervised learning method could be interpreted as a “hierarchical domain division technique” due to its role in splitting the data domain, also known as a “node.” Ideally, an optimized DT model would split the dataset into subsets leading to maximum information gain, hence leading to better classification17. To support this hierarchical structure, DT comprises leaf nodes, branches, and a root node. Every internal node contains an attribute test, followed by branching of the test result, and finally, leaf nodes that indicate class labels18. On a small dataset, DT is simple to interpret and its accuracy is comparable to that of other classification algorithms; however, it can overfit some datasets.
Advantages and disadvantages of AI algorithms used in the detection of diseases
| Algorithm | Advantages | Disadvantages | 
| Decision tree (DT) | Small trees are simple to interpret. | Overfits some datasets. | 
| Random forest (RF) | Handles uneven datasets with missing variables. | Overfits some datasets with noisy classification tasks. | 
| Support vector machine (SVM) | Works well in high-dimensional space. | Requires a long training time with a huge dataset. | 
| K-nearest neighbor (KNN) | Implementation is simple and clear. | Sensitive to irrelevant or noisy data. | 
| Naïve Bayes (NB) | Requires only a small quantity of training data for classification. | Cannot learn interactions between features because features are assumed independent. | 
| Convolutional Neural Network (CNN) | Used in image processing without the need for feature engineering. | Requires a longer training time. | 
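The information-gain criterion mentioned for decision trees can be illustrated with a short sketch. This is only a minimal example in plain Python, not the method of any cited study; the haemoglobin values, the labels, and the helper names (`entropy`, `information_gain`, `best_threshold`) are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(labels, left, right):
    """Entropy reduction obtained by splitting `labels` into `left` and `right`."""
    total = len(labels)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(labels) - weighted


def best_threshold(samples):
    """Try the midpoint between each pair of adjacent values; return (threshold, gain)."""
    ordered = sorted(samples)
    labels = [label for _, label in ordered]
    best = (None, -1.0)
    for i in range(1, len(ordered)):
        threshold = (ordered[i - 1][0] + ordered[i][0]) / 2
        gain = information_gain(labels, labels[:i], labels[i:])
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical toy data: (haemoglobin in g/dL, label). The values are made up.
samples = [(9.5, "anaemic"), (10.1, "anaemic"), (11.0, "anaemic"),
           (13.2, "normal"), (14.0, "normal"), (15.1, "normal")]

threshold, gain = best_threshold(samples)
print(f"best split: haemoglobin < {threshold:.1f} (information gain {gain:.2f} bits)")
```

A full decision tree would apply this search recursively to each resulting subset and across every available feature; the sketch shows only a single root-node split.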
Random Forest (RF)
The RF algorithm, initially introduced by Breiman in 2001, comprises a collection of predictive trees in which each tree is constructed using the values of a random vector that is sampled independently and with the same distribution for all trees23, 24. The class that receives the most tree votes is taken as the final prediction. RF possesses the capability to handle datasets containing missing variables and uneven data. Additionally, it is capable of computing crucial features for classification, which establishes it as one of the most efficient classifier algorithms in the domain.
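The voting step described above can be sketched as follows. This is a simplified illustration under stated assumptions: each "tree" is reduced to a one-split stump on a randomly chosen feature, the data are hypothetical (haemoglobin, MCV) pairs, only two classes are handled, and none of the function names come from a cited study or library.

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


def train_stump(rows, labels, rng):
    """Fit a one-split 'tree' on one randomly chosen feature (binary problems only)."""
    feature = rng.randrange(len(rows[0]))
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row[feature])
    if len(by_class) == 1:
        # The bootstrap sample happened to contain a single class.
        only = next(iter(by_class))
        return lambda row: only
    (label_a, values_a), (label_b, values_b) = sorted(by_class.items())[:2]
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    threshold = (mean_a + mean_b) / 2
    low, high = (label_a, label_b) if mean_a < mean_b else (label_b, label_a)
    return lambda row: low if row[feature] < threshold else high


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def forest_predict(forest, row):
    """Aggregate the individual tree predictions by majority vote."""
    return majority_vote([tree(row) for tree in forest])


# Hypothetical toy data: (haemoglobin g/dL, MCV fL). The values are made up.
rows = [(9.5, 70.0), (10.1, 72.0), (10.8, 75.0), (13.5, 88.0), (14.2, 90.0), (15.0, 92.0)]
labels = ["anaemic", "anaemic", "anaemic", "normal", "normal", "normal"]

forest = train_forest(rows, labels)
print(forest_predict(forest, (10.0, 71.0)))
```

Real random forests grow full decision trees and usually consider a random subset of features at every split; the stump is used here only to keep the bootstrap-and-vote idea visible.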
Support Vector Machine (SVM)
The SVM algorithm, first introduced in 1992 by Vladimir Vapnik, is utilized to determine a suitable class label to segment data samples29. SVM, in brief, is a learning algorithm that analyzes data used for categorization; when data patterns are high-dimensional and well separated, it is an excellent choice for classification30. With larger datasets, however, this approach requires an extended training period.
K-Nearest Neighbor (KNN)
KNN is a non-parametric classification algorithm developed by Joseph Hodges and Evelyn Fix in 195133. The model attempts to classify non-linear sample data points in a database into several classes34. However, it is sensitive to irrelevant or noisy data, and because it must compare each new sample with the stored training samples, it takes significantly longer in the testing procedure.
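A minimal sketch of KNN classification is given below to make the idea concrete. It assumes Euclidean distance and a simple majority vote among the k nearest training samples; the two-feature data points and the class names are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical two-feature training samples forming two well-separated groups.
train = [((1.0, 1.0), "class A"), ((1.2, 0.8), "class A"), ((0.9, 1.1), "class A"),
         ((5.0, 5.0), "class B"), ((5.2, 4.9), "class B"), ((4.8, 5.1), "class B")]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.0, 5.1)))
```

Because every prediction sorts all stored samples by distance, the cost of testing grows with the size of the training set, which is consistent with the slow testing phase noted above.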
Naïve Bayes (NB)
NB is a classification method based on Bayes' Theorem, developed by Thomas Bayes in 176036. It is an independent predictor hypothesis that evaluates the degree of relationship between the classification variables. Essentially, the class with the greatest likelihood is the class with the highest probability. It is effective with small datasets and has been implemented in the classification of hematological disorders and heart disease, among other disease diagnoses37, 38, 39. For instance, in the study of cardiovascular disease, scientists utilized NB as a data mining technique to discover correlations in a database between variables such as age, sex, and fasting blood sugar37, 40. NB is also applied in the classification of red blood cells (RBCs) and sickle cells39.
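The independence assumption behind NB can be sketched in a few lines: the score of each class is its prior multiplied by the per-feature conditional probabilities, and the class with the highest normalized score is chosen. The example below is an illustration only; it uses categorical features with Laplace (add-one) smoothing, which is an assumption made here rather than a detail from the cited studies, and the data are made up.

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    priors = Counter(labels)
    counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)      # feature index -> set of observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(label, i)][value] += 1
            values[i].add(value)
    return priors, counts, values


def predict_nb(model, row):
    """Return the normalized posterior probability of each class for `row`."""
    priors, counts, values = model
    total = sum(priors.values())
    scores = {}
    for label, n in priors.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(row):
            # P(feature value | class) with Laplace smoothing; features are
            # treated as independent given the class.
            score *= (counts[(label, i)][value] + 1) / (n + len(values[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


# Hypothetical categorical data: (haemoglobin level, RBC size). Values are made up.
rows = [("low", "small"), ("low", "small"), ("low", "normal"),
        ("normal", "normal"), ("normal", "normal"), ("normal", "small")]
labels = ["anaemic", "anaemic", "anaemic", "healthy", "healthy", "healthy"]

posterior = predict_nb(train_nb(rows, labels), ("low", "small"))
print(max(posterior, key=posterior.get), posterior)
```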
Deep Learning (DL)
DL is a sub-domain of ML composed of numerous layers that extract information from various input data formats, including images and numbers. The DL model derives its structure and operations from the neural networks found in the human brain and thus closely resembles them41. Its processing units are organized into input, hidden, and output layers. Each layer consists of units, or nodes, that are connected to the nodes in the adjacent layer, and each link is assigned a weight42. DL is divided into three subtypes: supervised, semi-supervised, and unsupervised. The main DL models that are currently being practiced are artificial neural networks (ANN), deep neural networks (DNN), recurrent neural networks (RNN), and convolutional neural networks (CNN). Leukemia31, malaria43, and thalassemia19 are among the many diseases that have been diagnosed using CNN in medical image processing. A key advantage of DL compared to conventional ML is that it eliminates the necessity for segmentation and feature extraction operations that are intrinsic to ML’s functionality.
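The layered structure described above, with weighted links between the nodes of consecutive layers, can be sketched as a forward pass through a small fully connected network. The layer sizes, the weights, and the choice of a sigmoid activation are assumptions made for illustration; training (adjusting the weights) is not shown.

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One layer: weights[j][i] is the weight on the link from input i to node j."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Pass the inputs through each (weights, biases) layer in turn."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer
]
print(forward([1.0, 2.0], layers))
```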
Convolutional Neural Network (CNN)
CNN, which Yann LeCun introduced in 1989 as a type of artificial neural network (ANN), is widely employed in the domain of image processing3, 20. CNN operates by assigning learnable weights to the components within an image and then differentiating them from the remainder of the image, which makes it fast on the testing set.
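The core operation of a CNN layer, sliding a small weight matrix (kernel) over an image, can be sketched as follows. The sketch assumes no padding and a stride of 1, uses a hand-written vertical-edge kernel rather than learned weights, and operates on a made-up 4 × 4 image.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 image with a dark left half and a bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d(image, kernel))
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

In a trained CNN the kernel values are learned from data instead of being specified by hand, which is why no manual feature engineering is required.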
Performance Metrics in AI Application
Performance metrics are crucial for evaluating the effectiveness of AI models, especially in the medical field, where accuracy is paramount. In this context, these metrics play a vital role in the AI development pipeline. AI models that achieve over 90% in these evaluations are typically regarded as highly effective or reliable, given the high stakes involved in patient care and diagnosis. The metrics that are commonly used are accuracy, sensitivity/recall, specificity, F1-Score, and precision21.
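In the definitions below, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
Accuracy is the proportion of correct predictions made by a classifier out of all predictions:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Sensitivity (recall) is the proportion of actual positive cases that are correctly identified:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
Specificity is the proportion of actual negative cases that are correctly identified:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
The F1 score is the harmonic mean of precision and recall:
$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
Precision is the proportion of predicted positive observations that are truly positive:
$$\text{Precision} = \frac{TP}{TP + FP}$$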
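As a worked illustration of these metrics, the sketch below computes them from the four confusion-matrix counts. The counts are hypothetical and the function name `classification_metrics` is introduced here only for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common metrics from confusion-matrix counts.

    Assumes every denominator is non-zero; real code should guard against
    empty classes.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: 50 diseased and 50 healthy samples.
results = classification_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.85 while specificity is only 0.80, which shows why a single metric is rarely sufficient to judge a diagnostic model.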
Application of AI in the Diagnosis of Hematological Disorders
In the area of hematology, a precise and rapid diagnosis of disease is necessary, as some blood disorders, such as acute myeloid leukemia (AML), are highly heterogeneous and exhibit transcriptomic, proteomic, and metabolomic variations22, 44, 45. Furthermore, depending on the knowledge and expertise of the hematologist, there is a significant chance of human error, which may delay or result in an incorrect diagnosis of the disease, causing severe disease progression or death. Thus, there is a growing need for a medical system that can reduce diagnosis turnaround time while preventing human errors.
In 1995, AI was first applied in a hematology laboratory for peripheral blood interpretation, flow cytometry immunophenotyping, and bone marrow reporting, where it proved valuable owing to its high accuracy and specificity46. The implementation of AI in hematology includes, but is not limited to, diagnosis, detection, and classification of diseases. Among the most common uses for machine learning (ML) in hematology are image processing, recognition, and classification8. Moreover, ML and deep learning (DL) have been applied to the detection of diseases utilizing visual and tabular data, among others47. Visual data contains images of blood and bone marrow smears, while tabular data provides information such as age, gender, and test results.
Relevant literature on ML-based leukaemia disease diagnostics
| Author | Year | Disease | Classifier | Dataset | Performance metrics | 
| Dasariragu | 2020 | AML | RF | Images of PBS | 92.99% accuracy for detection 93.45% accuracy for immature classification | 
| Dese  | 2021 | Leukaemia | SVM | Images of PBS | Accuracy ALL-100% AML-93.33% CLL-100% CML-93.33% | 
| Kimura  | 2019 | MDS, AA | CNN | Images of PBS | Sensitivity-96.2% Specificity-100% | 
| Wang  | 2022 | MDS, AA, AML | CNN | Images of BM | Accuracy-91.4% for MDS 92.9% accuracy for MDS, AA, AML | 
Haematological Malignancies
Leukaemia
Leukaemia is a type of blood cancer characterized by the uncontrolled proliferation of immature white blood cells (WBCs), called blasts. It is divided into acute or chronic leukaemia based on the progression of the disease. Acute leukaemia is an aggressive disease with rapid development due to a sudden increase in blast cells, which may result in the death of patients if it is not treated at an early stage. In contrast, chronic leukaemia usually manifests at a slower rate32. Furthermore, leukaemia can be myeloid or lymphoid in origin, depending on the kind of cell afflicted by the disease. Therefore, leukaemia is separated into four types: Acute Lymphoid Leukaemia (ALL), Chronic Lymphoid Leukaemia (CLL), Acute Myeloid Leukaemia (AML), and Chronic Myeloid Leukaemia (CML)50.
Leukaemia is clinically diagnosed by examining cells from blood smears under a microscope. This procedure is time-consuming and may be subject to human error. AI solutions have addressed these issues by recognizing blast cells and helping pathologists and haematologists make decisions.
Another study constructed an ML model to classify the AML subtypes M1 and M2 in line with the French-American-British (FAB) classification system51. Thirty-one AML-M1 and nineteen AML-M2 bone marrow smear images were analyzed, and two morphological features, six radiomics features, and one clinical feature were used in the classification models. Random forest and a broad learning system (BLS) were evaluated alongside four other classifiers (Naive Bayes, KNN, SVM, and ANN); the random forest model using all nine variables performed best, with an average accuracy of 0.998 ± 0.003, AUC of 0.998 ± 0.004, F1-score of 0.998 ± 0.004, recall of 0.996 ± 0.009, and precision of 1 ± 012.
Furthermore, Dese (2021) suggested ML for all leukaemia types (ALL, CLL, AML, CML) by examining 250 images of blood smear slides that were taken from clinical samples. First, the KNN algorithm was used to segment WBCs by distinguishing them from RBCs and platelets, followed by the watershed method to separate overlapped cells. Next, the researchers used SVM for classification, whereby the combination of these two algorithms achieved the accuracy of 100%, 93.33%, 100%, 93.33%, and 100% in classifying ALL, AML, CLL, CML, and normal cell types, respectively32.
Multiple other investigations utilizing AI have also been able to differentiate leukaemia from other diseases. For example, Kimura (2019) combined a CNN with extreme gradient boosting (XGBoost) to differentiate myelodysplastic syndrome (MDS) from aplastic anaemia (AA) with a sensitivity of 96.2% and a specificity of 100%48. Another study analyzed bone marrow smear images using a DL technique to differentiate between AML, MDS, and AA49. Using a CNN recognition model, the group distinguished MDS from non-MDS with 91.4% accuracy and distinguished AA, MDS, and AML from one another with 92.9% accuracy49.
Relevant literature on ML-based lymphoma disease diagnostics
| Author | Year | Disease | Classifier | Dataset | Performance metrics | 
| Achi  | 2019 | lymphoma | CNN | Images | Accuracy 100% | 
| Steinbuss  | 2021 | lymphoma | CNN | Images | Accuracy 95.56% | 
| Li  | 2020 | lymphoma | CNN | WSI | Accuracy 100% | 
Lymphoma
Lymphomas are a class of diseases caused by malignant lymphocytes that accumulate in lymph nodes and other lymphoid tissue, resulting in the clinical sign of lymphadenopathy. Based on the histological presence of Reed-Sternberg (RS) cells in Hodgkin lymphoma, lymphomas are primarily classified as either non-Hodgkin lymphoma (NHL) or Hodgkin lymphoma (HL)55. NHL is difficult to diagnose and subtype because doing so requires morphological, clinical, serological, and possibly cytogenetic/molecular data54, 53.
Achi et al. (2019) developed a model that uses whole slide imaging (WSI) to distinguish between four diagnostic categories: diffuse large B-cell lymphoma, small lymphocytic lymphoma, Burkitt lymphoma, and benign lymph node.
Furthermore, Steinbuss et al. (2021) studied nodal diffuse large B-cell lymphoma (DLBCL), nodal small lymphocytic lymphoma/chronic lymphocytic leukaemia (SLL/CLL), and tumor-free lymph nodes (LNs). Their deep learning model (CNN) achieved an accuracy of 95.56% on a held-out test set of 16,960 images.
Li et al. (2020) also applied deep learning, specifically a CNN, to WSI for lymphoma diagnosis and reported an accuracy of 100%.
Relevant literature on ML-based multiple myeloma disease diagnostics
| Author | Year | Disease | Classifier | Dataset | Performance metrics | 
| Wei  | 2021 | MM | GBDT, DNN, RF and SVM | Tabular data (blood tests) | ROC GBDT-0.9749 RF-0.9693 SVM-0.9052 DNN-0.8696 | 
| Chandradevan  | 2019 | MM | CNN | BMA smears | ROC Detection-(0.959 ± 0.008) Classification-(0.982 ± 0.03) | 
Multiple Myeloma
The neoplastic condition known as Multiple Myeloma (MM) is characterized by an accumulation of plasma cells in the bone marrow, the presence of monoclonal protein in the blood and/or urine, and associated tissue destruction in symptomatic individuals. Therefore, increased clonal plasma cells in the bone marrow, monoclonal protein in serum and/or urine, and related organ or tissue damage are used to diagnose MM55, 58.
Researchers have conducted several studies to identify and classify MM based on a variety of tabular and image data, two of which are discussed here. Wei et al. (2021) applied GBDT, DNN, RF, and SVM to tabular blood test data and reported ROC values of 0.9749 for GBDT, 0.9693 for RF, 0.9052 for SVM, and 0.8696 for DNN.
Furthermore, because haematological diseases may affect the bone marrow, Chandradevan et al. (2019) counted the cells in the marrow to differentiate multiple myeloma from AML. They developed two algorithms: the Faster Region-Based Convolutional Network for detection and the VGG16 convolutional network for classification. The algorithms showed high detection (0.959 ± 0.008 precision-recall AUC) and classification (0.982 ± 0.03 ROC AUC) performance. To create and evaluate the algorithms, Wright-stained bone marrow aspirate (BMA) smears of seventeen patients were obtained and converted into whole-slide images, along with three AML and two MM cases. These disease cases were selected based on the proportion of malignant cells (20 – 30% in MM and 30 – 50% in AML).
Numerous studies have used a variety of methods, such as flow cytometry, genetic expression, or medical image recognition, to predict the diagnosis of haematological malignancies. Nevertheless, no study in the literature has predicted or detected a haematological malignancy using only the patient's CBC results. This may be a productive avenue to explore in the future, since the CBC is often recognized as the first diagnostic procedure used by haematologists to confirm a leukaemia diagnosis57.
Relevant literature on ML-based IDA disease diagnostics
| Author | Year | Disease | Classifier | Dataset | Performance metrics | 
| Lotfi  | 2015 | IDA | NNET, SVM, KNN With (Maximum Voting theory) | Images of PBS | Accuracy 99%- dacrocyte 97%- elliptocyte 100%- schistocyte | 
| Megha  | 2016 | IDA | ANN | Images of PBS | Accuracy 81%-Elliptocyte 77%-Dacrocyte 78%-Degmacyte 75%-Schistocyte | 
| Betül Çil | 2020 | IDA, Thalassaemia | LR, KNN, SVM, ELM, RELM | Tabular data (CBC parameters) | Accuracy 96.30%-females 94.37%-males 95.95%-both sexes | 
| Appiahene | 2023 | IDA | CNN, KNN, NB, DT and SVM | Palm images | Accuracy 99.92%-CNN 99.92%-KNN 99.96%-NB 99.29%-DT 96.34%-SVM | 
Anaemia
Anaemia is a haematological condition characterized by low RBC and haemoglobin levels. Anaemia is diagnosed through microscopic inspection of a blood smear, which provides detailed information about the size and structure of RBCs. Aside from RBC shape, various tests, such as CBC, contain a wide range of characteristics that can help detect anaemia. AI techniques for detecting anaemia have shown promising results, as detailed in the sections below.
Iron Deficiency Anaemia (IDA)
IDA is the most prevalent haematological disorder caused by a deficiency in iron, which is required for haemoglobin formation and oxygen transport in the body. Reduction in iron concentration causes a decrease in the number of RBCs and alteration in their shape and size. A few studies have reported on the use of ML for IDA diagnosis.
Due to the similarity in symptoms and testing outcomes between IDA and thalassemia, AI applications have been proposed to distinguish between the two. For example, Betül Çil et al. (2020) proposed a model to differentiate between IDA and thalassemia using CBC parameters from 342 patients. The authors used Logistic Regression (LR), KNN, SVM, Extreme Learning Machine (ELM), and Regularised Extreme Learning Machine (RELM) to detect IDA. The algorithms that produced the most accurate results, categorized by gender, were ELM and RELM (96.30% accuracy for females) and SVM (94.37% accuracy for males). RELM attained an accuracy rate of 95.95% across both genders2.
IDA is typically diagnosed through microscopic examination. However, Appiahene (2023) employed palm images to detect iron deficiency, as anaemia causes paleness in many parts of the body. First, the palm images were extracted and augmented before the selection of the region of interest (ROI). Then, the image was processed in CIE L*a*b* color space for feature selection. Of the five classifiers used (CNN, KNN, NB, DT, and SVM), NB performed the best with 99.96% accuracy, followed by CNN and KNN with 99.92% each, whereas SVM achieved the lowest accuracy of 96.34%61.
Relevant literature on ML-based thalassaemia disease diagnostics
| Author | Year | Disease | Classifier | Dataset | Performance metrics | 
| Purwar  | 2021 | Thalassaemia | NB, RF, KNN | Images of PBS | 99%-accuracy 100%-specificity 100%-sensitivity | 
| Tyas  | 2020 | Thalassaemia | MLP | Images of PBS | 98.11%-accuracy | 
| Ashari  | 2020 | Thalassaemia | RF | Tabular data (CBC parameters) | 100%-accuracy, precision, and recall | 
Relevant literature on ML-based SCA disease diagnostics
| Author | Year | Disease | Classifier | Dataset | Performance metrics | 
| Patgiri  | 2021 | SCA | NB, KNN | Images of PBS | Accuracy 98.87%-KNN 98.21%-NB | 
| Sen  | 2021 | SCA | RF, LR, NB, SVM | Images of PBS | Accuracy 92%-RF 90%-LR 90%-NB 88%-SVM | 
| Haan | 2020 | SCA | DL (two neural networks) | Images of PBS | 98%-accuracy | 
Thalassemia
Thalassemia is a genetic disease that occurs due to a mutation in the genes responsible for the synthesis of haemoglobin (Hb) in RBCs. Haemoglobin is composed of two types of globin chains, alpha (α) and beta (β). A mutation in the genes encoding either chain will lead to a limited supply of oxygen to the organs, making a delayed diagnosis potentially life-threatening65. Recent developments in AI can expedite the verification of the diagnosis.
Furthermore, Tyas et al. (2020) detected nine types of abnormal erythrocytes in blood smears of thalassemia patients using multi-layer perceptron (MLP) neural networks. Morphological features, including texture and color features, were extracted from images through the segmentation process in this study. The model achieved 98.11% accuracy in differentiating between thalassemic and healthy samples62. Moreover, tabular data has also been analyzed using AI methods in several studies. For instance, Ashari (2020) used RF to classify thalassemia using haematological data. From ten CBC parameters, the model determined haematocrit as the most important variable, and RF analysis showed 100% for all the metrics used in the study63.
Sickle Cell Anaemia (SCA)
SCA is an inherited haematological disorder caused by an alteration in the haemoglobin gene. The sickle-shaped RBCs that result from this disease have restricted movement in the bloodstream, leading to a decrease in oxygen concentration in the body. Various AI methods have been proposed to detect SCA.
Additionally, Haan (2020) developed a deep-learning model to diagnose SCA using a blood smear image captured with a smartphone microscope. The model consists of two neural networks (NNs): the first improves the quality of the image to match that of a laboratory microscope image, while the second uses the enhanced image to differentiate between sickle cells and normal cells. The proposed method achieved an accuracy of 98% when applied to 96 patient samples, including 32 samples taken from SCA patients64.
Challenges in Using AI for the Diagnosis of Haematological Disorders
The use of AI in disease diagnosis and classification is becoming increasingly widespread; however, researchers continue to face several challenges, some of which are discussed in this section.
Data Availability
The availability and accessibility of data are among the main issues facing researchers worldwide because AI requires a considerable amount of data67. In medicine, the lack of test data hampers the rapid growth of AI applications in certain diseases. For instance, AML can be diagnosed before a bone marrow biopsy is performed; consequently, although researchers would like to apply AI to bone marrow samples, there is insufficient data on bone marrow images. Furthermore, the ethical concerns and preservation costs associated with sharing patients’ clinical data restrict medical centers and hospitals from contributing their information to public databases68.
Most importantly, the rapid growth of AI applications requires additional steps to validate the initial training datasets. Every supervised model is typically built and optimized using a considerable amount of data known as the ‘training set’ before it is tested on a separate ‘testing dataset’. After model development, it is often necessary to validate the model using another collection of data, known as the ‘validation set’, or the original dataset itself6. Consequently, this emphasizes the necessity for enhanced accessibility to the necessary data. However, the issue of inadequate data can now be alleviated through data augmentation, for which researchers employ a variety of techniques, such as image preprocessing, rotation, flipping, and shearing, to increase the size of the dataset and prevent memorization, particularly in deep learning algorithms69, 70.
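The dataset split and the rotation and flipping operations mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: images are represented as nested lists of pixel values, the 70/15/15 split ratio is an arbitrary choice, and the function names are hypothetical. Shearing and other preprocessing steps are omitted.

```python
import random


def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle and split items into training, validation, and testing sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def rotate90(image):
    """Rotate an image (list of pixel rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror an image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror an image top to bottom."""
    return [list(row) for row in image[::-1]]


def augment(image):
    """Return the original image together with three transformed copies."""
    return [image, rotate90(image), flip_horizontal(image), flip_vertical(image)]


# Hypothetical 2x2 "image" and a dataset of 20 sample identifiers.
image = [[1, 2],
         [3, 4]]
train_ids, val_ids, test_ids = split_dataset(range(20))

print(len(train_ids), len(val_ids), len(test_ids))
print(augment(image))
```

One design point worth noting: augmentation is usually applied only to the training set after the split, so that transformed copies of the same image do not appear in both the training and testing data.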
Differences Between Making a Diagnosis and the Quality of Data
AI models require high-quality images and data for processing; however, obtaining a high-quality image could be complicated due to the existence of impurities in the image, imaging using low-resolution microscopes, or incorrect imaging parameters. The variance in quality in the sample microscopic slides could be caused by several reasons, including inaccurate smearing technique, mishandling of the sample, and the use of low-quality reagents such as stains32. One way of overcoming this is by obtaining a significant amount of data from many sources68. However, the variation in images and quality may generate “confusion” in the AI model as the training dataset will not have a coherent pattern. For example, in the case of diagnosing AML, some of the blast cells are indistinguishable, which may result in different annotations by different pathologists. Hence, taking the data from several sources can affect the AI model's accuracy, which can be overcome by re-annotating the data.
Algorithm Challenges
AI methods, particularly DL methods, require an extensive amount of data for the training set. As a result, researchers must resort to image augmentation techniques, which can reduce the accuracy of the result. An additional limitation is that many AI models, CNNs in particular, cannot explain how they reach their outputs and are therefore referred to as "black boxes"52. For example, pathologists require interpretable medical diagnoses; however, medically recognized disorders are frequently multidimensional, and in some cases the reasoning is beyond human comprehension. Thus, techniques for dissecting these "black box" models have been created, which may assist in minimizing the uncertainty56. Lakkaraju et al. (2017) proposed a framework called “Black Box Explanations by Transparent Approximations” (BETA), which explains the behavior of any black box classifier by simultaneously optimizing fidelity to the original model and the interpretability of the explanation. This approach enables users to explore the black box behavior in different subspaces of interest, which helps to expose potential biases of the original model57.
Ethical and Social Implications
Generally, the integration of AI in healthcare carries multiple caveats concerning data protection, informed consent, the humane aspect of medical consultations, and social gaps that could lead to significant bias if AI is trained with non-representative datasets71, 72, 73. Compared with fields such as genetic disorders, where more sensitive patient data is involved, the diagnosis of haematological and haemato-oncological disorders poses less risk of certain social and ethical implications, because the primary dataset comprises images of blood smear microscopic slides and values from a CBC report.
However, regardless of the sensitivity of the information used, regulatory frameworks and bodies must be established to ensure that AI is used with minimal ethical and social compromise. For example, socioeconomic gaps in our societies must not restrict access to AI, so that trained AI models can account for all variations and do not produce incorrect diagnoses. This is all the more important because blood disorders vary significantly across different ethnic and demographic groups74.
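One practical way to check whether a trained model serves all groups comparably is to report its accuracy separately for each group rather than as a single overall figure. The sketch below assumes hypothetical group labels ("A", "B") and hypothetical predictions; it is an illustration and not a procedure described in the cited studies.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Return a dict mapping each group label to the model's accuracy on it."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical evaluation set: five samples from group A, four from group B.
groups = ["A"] * 5 + ["B"] * 4
y_true = [1, 0, 1, 1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 0]

per_group = accuracy_by_group(groups, y_true, y_pred)
for group, acc in sorted(per_group.items()):
    print(f"group {group}: accuracy {acc:.2f}")
```

A large gap between groups, as in this made-up example, would suggest that the training data under-represents one of them.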
Conclusion
In this article, we reviewed different AI models and their application in the diagnosis of several haematological disorders. Advancements in this technology could not only speed up the diagnostic process and increase the accuracy of clinical results but also reduce healthcare costs significantly. AI therefore has the potential to substantially advance our healthcare systems. In certain circumstances, however, human intervention and expertise are still necessary to interpret what lies beyond the capabilities of AI, which can process and analyze patterns and correlations but not the context of the information. Researchers may face challenges because AI requires a large amount of data, which raises questions about data quality and integrity; however, this concern should diminish as more data is made available. Regulatory frameworks set and enforced by appropriate bodies can be established to minimize the social and ethical implications of AI. As a roadmap for the future, the capabilities of this technology should be explored further in other areas such as personalized medicine, drug discovery, and medical imaging analysis, as it has the potential to streamline numerous processes in our healthcare systems.
Abbreviations
AA - Aplastic Anemia, AI - Artificial Intelligence, ALL - Acute Lymphoblastic Leukemia, AML - Acute Myeloid Leukemia, ANN - Artificial Neural Network, BMA - Bone Marrow Aspirate, BM - Bone Marrow, CBC - Complete Blood Count, CLL - Chronic Lymphocytic Leukemia, CML - Chronic Myeloid Leukemia, CNN - Convolutional Neural Network, DLBCL - Diffuse Large B-Cell Lymphoma, DL - Deep Learning, DNN - Deep Neural Network, DT - Decision Tree, FAB - French-American-British (classification system), Hb - Hemoglobin, HL - Hodgkin Lymphoma, IDA - Iron Deficiency Anemia, IgG, IgA, IgM - Immunoglobulins G, A, and M, KNN - K-Nearest Neighbor, MDS - Myelodysplastic Syndromes, ML - Machine Learning, MM - Multiple Myeloma, NB - Naïve Bayes, NHL - Non-Hodgkin Lymphoma, PBS - Peripheral Blood Smears, RF - Random Forest, RNN - Recurrent Neural Network, ROC - Receiver Operating Characteristic, SCA - Sickle Cell Anemia
Acknowledgments
None.
Authors’ contributions
HA and AI contributed to the design of the research. HA and ZA extracted, analyzed, and summarized the data. HA, ZA, and HNMN drafted the manuscript. AI, MA, HY, and NY edited and reviewed the manuscript. AI, MA, and HY supervised the research. All authors read and approved the final manuscript.
Funding
We would like to acknowledge the financial support provided by Universiti Sains Malaysia (USM), Malaysia, under the Short-Term Grant (STG) Scheme (STG Code: 304.CIPPT.6315621).
Availability of data and materials
Not applicable.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
 
                        