Artificial Intelligence (AI) refers to a variety of computer-driven techniques designed to perform tasks traditionally requiring human judgment and expertise. From rule-based decision support to advanced self-learning systems, AI is reshaping how care is delivered, diagnoses are made, and research is conducted. Whether you are a nurse, physician, public health expert, or healthcare researcher, this guide offers a clear, conversational overview of major AI approaches and real-world examples to help you begin planning and implementing AI initiatives in your setting.
This guide covers: Supervised Learning, Unsupervised Learning, Reinforcement Learning, Deep Learning, Natural Language Processing, Large Language Models, Computer Vision, and Robotics and Automation.
Supervised learning is like teaching a computer by example. You provide a labeled data set—say, thousands of X-ray images tagged as “pneumonia” or “normal”—and the model learns patterns that distinguish these classes. Once trained, the model can classify new cases automatically.
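The "teaching by example" idea can be sketched in a few lines of plain Python. This toy nearest-centroid classifier stands in for a real model, and the two features (a mean-intensity and an opacity score) and all values are invented for illustration:

```python
def train_centroids(examples):
    """Average the feature vectors for each label ("learning by example")."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}

def classify(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, features))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Toy labeled training set: [intensity, opacity] -> diagnosis
training = [
    ([0.8, 0.9], "pneumonia"), ([0.7, 0.8], "pneumonia"),
    ([0.2, 0.1], "normal"),    ([0.3, 0.2], "normal"),
]
model = train_centroids(training)
prediction = classify(model, [0.75, 0.85])  # a new, unseen case
```

A real system replaces the centroid rule with a trained network, but the workflow is the same: labeled examples in, a decision rule out.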
A deep CNN was trained on over 100,000 expert-graded retinal fundus images from the EyePACS program and multiple Indian clinics, and validated on two hold-out sets (EyePACS-1 and Messidor-2). At a high specificity operating point, the model achieved 90.3% sensitivity and 98.1% specificity on EyePACS-1, and 87.0% sensitivity and 98.5% specificity on Messidor-2, demonstrating ophthalmologist-level performance in detecting referable diabetic retinopathy. In a subsequent randomized trial in youth (ACCESS), autonomous deployment of this algorithm at the point of care increased diabetic eye-exam completion rates from 22% to 100% within six months, closing critical care gaps in diverse pediatric populations.
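The "operating point" mentioned above is simply the score threshold at which the model calls a case positive; moving it trades sensitivity against specificity. A minimal sketch with made-up scores and labels (not study data):

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is the positive call."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Toy model scores (probability of referable disease) and true labels
scores = [0.95, 0.80, 0.60, 0.40, 0.20, 0.05]
labels = [1,    1,    1,    0,    0,    0]
low = sens_spec(scores, labels, 0.30)   # permissive operating point
high = sens_spec(scores, labels, 0.70)  # high-specificity operating point
```

Raising the threshold makes positive calls rarer, so specificity rises while sensitivity falls, which is exactly the trade-off a screening deployment must choose.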
Researchers fine-tuned a pre-trained CNN on a 129,000-image dermatology dataset. Transfer learning accelerated training and allowed the network to capture subtle color and texture cues that distinguish malignant from benign lesions (1).
A hybrid deep learning model combining convolutional and recurrent layers learned to identify waveform morphologies from 90,000 single-lead ECG traces, matching cardiologist accuracy in detecting abnormal rhythms (1).
Unsupervised learning explores data without provided labels, revealing hidden structures—like grouping patients into meaningful subtypes or spotting unusual cases that fall outside normal patterns.
In the Swedish ANDIS cohort of 8,980 newly diagnosed patients, k-means clustering on six baseline variables (age at diagnosis, BMI, HbA1c, HOMA2-B, HOMA2-IR, and GADA autoantibody status) revealed five distinct subgroups with divergent pathophysiology and rates of retinopathy, nephropathy, and cardiovascular complications; these clusters were subsequently validated in large Chinese and US cohorts, confirming their prognostic value for tailoring precision-medicine strategies.
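The k-means procedure behind this kind of clustering alternates two steps: assign each patient to the nearest centroid, then recompute each centroid as the mean of its members. A minimal sketch on made-up, standardized two-feature data (not ANDIS measurements):

```python
def kmeans(points, k, iters=20):
    """Alternate nearest-centroid assignment and re-averaging."""
    # Deterministic initialization for this sketch (works for k = 2):
    # seed centroids with the first and last points.
    centroids = [points[0][:], points[-1][:]][:k]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        for i, members in enumerate(clusters):
            if members:  # mean of each feature across cluster members
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, clusters

# Two made-up patient groups in standardized (feature1, feature2) space
points = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0],
          [1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]
centroids, clusters = kmeans(points, 2)
```

Production analyses add careful feature standardization, multiple random restarts, and a method for choosing k, but the core loop is this simple.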
By applying latent class analysis to biomarker and clinical data from over 2,400 patients enrolled in the ARMA and ALVEOLI trials, researchers identified two reproducible ARDS subphenotypes: a “hyper-inflammatory” phenotype (more shock and metabolic acidosis) and a “hypo-inflammatory” phenotype. The two differed markedly in mortality (≈46% vs. 23%) and showed opposite responses to high- versus low-PEEP ventilation strategies.
Leveraging hierarchical clustering of serial vital signs and laboratory trajectories from 20,000 MIMIC-III sepsis episodes, investigators defined four clinical phenotypes, most prominently a “delta” cluster with escalating lactate and SOFA scores that was associated with a 30% higher in-hospital mortality; these data-driven subgroups now underpin stratified trial designs and targeted sepsis interventions.
Reinforcement learning (RL) teaches algorithms by reward. An RL agent interacts with a patient “environment”—either simulated or historical—and receives feedback (“rewards”) based on outcomes. Over many trials, it learns policies that maximize positive reward signals (e.g., survival, stable vitals).
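A tabular Q-learning sketch makes the reward loop concrete. The "environment" here is a deliberately trivial stand-in (five discretized condition states, two actions, reward only on reaching the stable state), nothing like a real patient model:

```python
import random

# States: discretized patient condition (0 = most unstable ... 4 = stable target).
# Actions: 0 = decrease support, 1 = increase support.
# Reward: +1 on reaching the stable state, 0 otherwise.

def step(state, action):
    nxt = max(0, min(4, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == 4 else 0.0)

def q_learn(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=1):
    """Tabular Q-learning: nudge Q toward reward + discounted best next value."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(5)]
    for _ in range(episodes):
        s = rng.randrange(4)                      # start in a non-target state
        for _ in range(20):
            greedy = 0 if Q[s][0] >= Q[s][1] else 1
            a = rng.randrange(2) if rng.random() < eps else greedy
            nxt, r = step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[nxt]) - Q[s][a])
            s = nxt
            if s == 4:                            # episode ends at the target
                break
    return Q

Q = q_learn()
policy = [0 if Q[s][0] >= Q[s][1] else 1 for s in range(5)]
```

After training, the greedy policy chooses "increase support" from every unstable state, because that action leads toward the rewarded outcome. Clinical RL systems replace this toy table with learned state representations and far more careful reward design.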
Böck et al. and Wu et al. built on the MIMIC-III ICU repository to train an off-policy, distributional Q-learning agent that represents each patient’s state (vitals, laboratory values, administered treatments) and optimizes a reward function tied to survival. In retrospective simulations, the AI Clinician’s suggested fluid and vasopressor dosing policies were associated with lower predicted mortality than typical clinician choices on historical sepsis-management data.
Liu et al. trained an RL agent on thousands of invasive mechanical ventilation episodes (encoding tidal volume, PEEP, FiO2, and patient responses) to learn policies that maximize simulated patient “reward” (e.g., stable gas exchange, reduced ventilator-induced injury). Den Hengst et al. further infused ARDSnet guideline rules into the reward structure, and both studies showed that VentAI’s recommended ventilator settings outperformed retrospective clinician benchmarks in simulated outcome returns.
Wang et al. and Desman et al. developed an RL framework that first constructs a patient-state model from longitudinal EHR glucose and insulin time-series in hospitalized patients with Type 2 diabetes. Using distributional RL, the agent then learns insulin dosing policies that optimize glycemic control metrics while penalizing hypoglycemia risk. In proof-of-concept trials, these learned policies achieved tighter glucose targets and eliminated severe hypoglycemia episodes compared to standard care.
Deep learning refers to neural networks with many layers that can extract hierarchical features—edges, shapes, textures—from raw data like images or waveforms. These models have driven breakthroughs in pattern recognition tasks once thought out of reach.
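At its core, a deep network is a stack of simple layers, each transforming the previous layer's output into higher-level features. The sketch below hand-sets weights so a two-layer network computes XOR, a pattern a single linear layer cannot represent, which is the basic argument for depth:

```python
def relu(v):
    """Elementwise rectifier: negative activations become zero."""
    return [max(0.0, x) for x in v]

def dense(x, W, b):
    """One fully connected layer: y = W·x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(x, layers):
    """Stack layers; ReLU between them lets the net bend decision boundaries."""
    for i, (W, b) in enumerate(layers):
        x = dense(x, W, b)
        if i < len(layers) - 1:
            x = relu(x)
    return x

# Hand-set weights so the two-layer net computes XOR of its two inputs.
layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden: "any on" and "both on"
    ([[1.0, -2.0]], [0.0]),                   # output: sum minus 2 * overlap
]
outputs = {xy: forward(list(xy), layers)[0]
           for xy in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

In practice the weights are learned from data by backpropagation rather than set by hand, and real networks stack dozens of such layers over images or waveforms instead of two numbers.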
A systematic review and meta-analysis of 18 prospective studies showed that deep-learning–based diabetic retinopathy screening algorithms (including the original 10-layer CNN) achieved a pooled sensitivity of 87.7% and specificity of 90.6% across diverse clinical settings. In the ACCESS randomized trial, autonomous AI eye exams increased screening completion in youth with diabetes from 22% to 100% within six months, closing critical care gaps in under-resourced populations.
Manzoor et al. introduced a dual-stage deep-learning pipeline that first segments lesion boundaries in dermoscopic images, then classifies them with a DenseNet backbone, achieving 92.3% overall accuracy and producing attention maps that highlight key image regions. Building on this, Arshad et al. developed a network-level fusion model combining multiple CNNs to localize and categorize over 20 lesion types, reaching a mean F1-score of 0.85 while offering saliency-based explanations for each decision.
An et al. designed an ensemble of EfficientNetB0 and DenseNet121 augmented with attention modules, boosting the pneumonia F1-score to 0.82 on the ChestX-ray14 test set and surpassing earlier single-model baselines. In a multicenter evaluation, Anderson et al. showed that radiologists aided by an FDA-cleared AI system improved overall chest-X-ray abnormality detection accuracy by 10.1% (AUC from 0.88 to 0.97), demonstrating substantial real-world assistive value.
Classic NLP pipelines break free-text notes into tokens, map phrases to medical concepts, and identify context such as negation or temporality. These processes turn narrative text into structured data for research and decision support.
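A toy version of the first two steps (tokenization and dictionary lookup) fits in a few lines. The mini concept dictionary below is illustrative; real pipelines map spans to full UMLS vocabularies with millions of entries:

```python
import re

# Illustrative mini dictionary of phrases -> UMLS-style concept codes.
CONCEPTS = {
    "chest pain": "C0008031",
    "pneumonia": "C0032285",
    "fever": "C0015967",
}

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def map_concepts(text, max_len=3):
    """Greedy longest-match lookup of token spans against the dictionary."""
    tokens = tokenize(text)
    found, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in CONCEPTS:
                found.append((phrase, CONCEPTS[phrase]))
                i += n
                break
        else:
            i += 1  # no concept starts here; advance one token
    return found

note = "Patient reports chest pain and fever; no evidence of pneumonia."
concepts = map_concepts(note)
```

Note that this lookup happily extracts "pneumonia" even though the note negates it, which is exactly why the context-detection step described next exists.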
MedLEE employs a multi-stage pipeline: it first parses radiology and pathology narratives using a domain-specific grammar, then maps parsed phrases to UMLS concepts via a coded dictionary, and finally filters results through semantic constraints. In a systematic review of clinical data warehouses, Bazoge et al. (2023) reported that MedLEE achieved approximately 83% recall and 89% precision when extracting findings such as “pulmonary opacity” or “hepatocellular carcinoma” from free-text reports, illustrating how rule-based parsing can reliably convert narrative text into structured, research-ready data.
cTAKES chains together sentence splitting, tokenization, dictionary lookup (leveraging UMLS and SNOMED-CT), and the built-in assertion module to extract clinical entities and detect negation or uncertainty. Kim et al. demonstrated its scalability by processing over one million notes to identify housing and food insecurity, achieving a positive predictive value of 77.5% for housing-issue mentions. Lossio-Ventura et al. further benchmarked cTAKES on real-world EHR data, showing F1-scores above 0.80 for core concept recognition tasks and underscoring its versatility across specialties.
Building on simple negation detection, ConText applies rule-based “scope windows” around trigger terms to label whether a finding is negated, historical, or pertains to someone other than the patient. Mirzapour et al. adapted the algorithm for French clinical notes, achieving F1-scores of 0.93 for negation and 0.86 for temporality, while Slater et al. reimplemented ConText logic with dependency-grammar heuristics, processing discharge summaries at over 2,000 sentences per second while maintaining >90% accuracy in distinguishing present from historical mentions.
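A stripped-down version of the scope-window idea, with a hypothetical trigger list, a fixed five-token window, and termination at conjunctions (real ConText uses much richer trigger and termination lexicons and also handles historical and experiencer cues):

```python
NEGATION_TRIGGERS = ["no", "denies", "without", "no evidence of"]
TERMINATORS = {"but", "however"}   # words that close a trigger's scope early
WINDOW = 5                         # tokens after a trigger within its scope

def negated_terms(tokens, targets):
    """Mark target findings that fall inside a negation trigger's scope window."""
    flags = {t: False for t in targets}
    i = 0
    while i < len(tokens):
        trig_len = 0
        # try longest triggers first so "no evidence of" beats plain "no"
        for trig in sorted(NEGATION_TRIGGERS, key=lambda t: -len(t.split())):
            n = len(trig.split())
            if tokens[i:i + n] == trig.split():
                trig_len = n
                break
        if trig_len:
            scope = []
            for tok in tokens[i + trig_len : i + trig_len + WINDOW]:
                if tok in TERMINATORS:
                    break
                scope.append(tok)
            for t in targets:
                if t in scope:
                    flags[t] = True
            i += trig_len
        else:
            i += 1
    return flags

tokens = "patient denies fever but reports cough no evidence of pneumonia".split()
flags = negated_terms(tokens, ["fever", "cough", "pneumonia"])
```

Here "fever" and "pneumonia" are correctly flagged as negated, while "cough" escapes because "but" terminates the scope of "denies" before reaching it.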
LLMs are transformer-based models trained on massive text collections. They can perform a variety of language tasks—from generating summaries to answering questions—often with little or no task-specific training data.
Med-PaLM 2 builds on PaLM 2 by fine-tuning on millions of medical QA pairs from MultiMedQA and USMLE-style datasets, and employs ensemble-refinement and chain-of-retrieval prompting to improve long-form reasoning. It scored 86.5% on the MedQA benchmark (roughly 19 percentage points above the original Med-PaLM) and achieved state-of-the-art accuracy on MedMCQA, PubMedQA, and MMLU clinical topics. In head-to-head physician evaluations on 1,066 consumer medical questions, clinicians ranked Med-PaLM 2’s answers higher than human-written answers on eight of nine utility metrics (p < 0.001).
DistillNote uses a retrieval-augmented generation (RAG) pipeline: it first retrieves the top-k most relevant passages from a clinical-note vector store, then prompts a Llama 2 (13B) model to distill key findings into structured summaries. In an aged-care study on EHR malnutrition data, zero-shot DistillNote summaries reached >90% accuracy against a gold-standard dataset and, when used as features in downstream predictive models, improved AUC by 7% over non-RAG baselines.
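The retrieval step of such a RAG pipeline can be sketched with bag-of-words vectors and cosine similarity. Real systems use learned embeddings and a vector database, and the note passages below are invented:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (a crude stand-in for learned embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, passages, k=2):
    """Rank stored note passages by similarity to the query; keep top-k."""
    qv = vectorize(query)
    ranked = sorted(passages, key=lambda p: cosine(qv, vectorize(p)), reverse=True)
    return ranked[:k]

passages = [
    "albumin low and weight loss noted, malnutrition risk high",
    "patient ambulating well after knee surgery",
    "dietitian consult for poor oral intake and weight loss",
]
context = retrieve("weight loss and malnutrition", passages)
prompt = "Summarize key findings:\n" + "\n".join(context)
```

The assembled `prompt` (retrieved context plus an instruction) is what would be sent to the LLM, grounding its summary in the patient's actual notes rather than its parametric memory.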
This system augments an LLM with real-time retrieval of clinical-guideline passages: prescriptions and patient context are used to fetch relevant guideline snippets, which the LLM then uses to flag potential prescribing errors. In simulated vignette tests in ophthalmology, the RAG-LLM framework achieved 92% precision in error identification and increased pharmacist correction rates by 35%. Another ophthalmology implementation (ChatZOC) retrieved answers from a 300,000-document specialty corpus to address 300 clinical questions, outperforming GPT-4 by 12% in guideline adherence (p < 0.01).
Computer vision applies convolutional and transformer-based networks to medical imagery—X-rays, endoscopy video, histopathology slides—automating detection, segmentation, and quantification tasks.
A deep segmentation CNN was integrated into live endoscopic video feeds, where pixel-level annotation models flagged suspicious mucosal protrusions in real time. In a prospective randomized trial, this system delivered an absolute increase of 7.3 percentage points in adenoma detection rate (a 29% relative improvement) by alerting endoscopists to subtle, flat lesions they might otherwise miss.
Leveraging a two-stage object-detection network trained on over 24,000 multi-site mammograms, AI-STREAM overlays bounding boxes on suspicious masses and microcalcifications. In a prospective, multicenter cohort study, radiologists assisted by the model achieved a 13.8% higher invasive-cancer detection rate without any increase in recall rates, demonstrating how AI can boost sensitivity while preserving specificity.
A patch-wise CNN ensemble scanned digitized lymph-node whole-slide images in 256×256-pixel tiles, assigning metastasis probability scores that were aggregated into slide-level predictions. In the CAMELYON16 challenge, this approach reached an AUC of 0.994 for detecting breast-cancer metastases, on par with expert pathologists, highlighting the power of weakly supervised deep learning on gigapixel histopathology datasets.
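The tiling-and-aggregation pattern is simple to sketch: enumerate non-overlapping patches, score each one (here with made-up probabilities standing in for a CNN), and pool the patch scores into a slide-level prediction:

```python
def tile_coords(width, height, tile=256):
    """Top-left corners of non-overlapping tile x tile patches."""
    return [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]

def slide_prediction(patch_scores):
    """Aggregate per-patch metastasis probabilities into a slide-level score.
    Max-pooling is one simple aggregation used in weakly supervised pipelines."""
    return max(patch_scores)

coords = tile_coords(1024, 512)  # a 4 x 2 grid of 256-px tiles
# Made-up per-patch probabilities; one "hot" patch drives the slide score.
slide_score = slide_prediction([0.02, 0.01, 0.97, 0.03, 0.05, 0.02, 0.01, 0.04])
```

Max-pooling reflects the clinical logic that a single convincing metastatic region makes the whole slide positive; other pipelines use top-k averaging or a second-stage classifier over the patch-score heatmap.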
Robotics & automation bring AI into both physical and administrative workflows—from robot-assisted surgery to software bots that automate repetitive tasks.
By translating surgeon hand movements into finely scaled robotic-arm actions with integrated tremor suppression and motion scaling, the da Vinci Surgical System aims to enhance precision in minimally invasive procedures; however, Csirzó et al. found no significant difference in clinical outcomes for endometriosis surgery compared with conventional laparoscopy, underscoring the importance of procedure-specific evaluation. Conversely, Wang et al. demonstrated that obese patients undergoing da Vinci–assisted radical prostatectomy achieved perioperative, functional, and oncologic results comparable to non-robotic approaches, suggesting that body habitus does not limit robotic efficacy in high-BMI cohorts.
Automated dispensing cabinets (ADCs) combine barcode scanning with robotic-arm retrieval to assemble medication orders; a systematic review by Shbaily et al. reported an 80% reduction in overall dispensing errors immediately after automation, with these gains maintained when pharmacy support staff were integrated into the workflow. In surgical and ambulatory-surgery settings, Borrelli et al. found that ADC implementation reduced controlled-substance discrepancies by 16%–62.5% and medication errors by 23% to 100% across studies, while user-satisfaction rates exceeded 81% and labor hours decreased, demonstrating broad clinical, operational, and economic benefits.
In opioid-treatment facilities, automated liquid-handling stations integrated with EHR-driven barcoding prepare, label, and dispense methadone under nurse oversight; Al Nemari & Waterson reported that post-automation dispensing errors fell from 1.0% to 0.24%, incomplete prescriptions decreased from 3.0% to 1.83%, and total patient time in the department dropped from 17.09 to 11.81 minutes, freeing pharmacists for higher-value clinical work. Complementing this, Takase et al. showed that robotic dispensing systems cut total dispensing errors by ~80% and nearly eliminated wrong-strength and wrong-drug incidents, underscoring their role in high-risk medication environments.