About

Rajesh More

PhD - Technical Architect, AI/ML Expert, Mentor

Doctor of Philosophy (Gen AI) – IIT Patna

Empowering innovation with AI, nurturing the next generation of AI leaders.

Welcome! I'm Rajesh More, a Technical Architect with 10+ years of expertise in designing, deploying, and scaling AI/ML solutions across healthcare, e-commerce, fintech, and manufacturing. Currently at CWX, I lead global AI teams, drive research, and architect production-grade AI pipelines on GCP, AWS, and Azure.

My journey blends deep technical skills with a passion for teaching and mentoring. Through Deccan AI School, I guide ambitious learners toward impactful AI careers. Whether you're a recruiter, a collaborator, or an aspiring student, let's connect and build the future together.

Portfolio

Projects

Lung Cancer Risk Prediction Pipeline

Built production ML pipeline for medical imaging using GCP, Kubeflow, and deep learning models. Automated training, deployment, and monitoring for scalability.

Secure Enterprise Search Application on PII Data & AIOps - Back Pocket App

Designed an enterprise search solution using GCP tools (DLP API, Firestore, BigQuery, GCS, GAR, GKE). Led a 6-member AI, Data & Infra team to deliver a secure and scalable application.

Product Recommendation and Optimization - Model Engineering + AIOps

Analyzed customer purchase journey (page views, cart actions, purchases) across mobile vs desktop. Measured CTR and conversion rates for optimization.

Imagery Analytics - MLOps

Orchestrated image classification models using Kubeflow & Vertex AI pipelines. Managed continuous training, versioning, and deployments with BigQuery, GCS, Jenkins & GitHub.

Patents, Publications & Conferences

IEEE Paper: Estimation of Response Time with Neural Networks in Cloud Computing System

IEEE Paper: Multimedia Data Processing Using Hybrid Blockchain Technologies in Healthcare Industry

IEEE Paper: Synergistic Neural Matrix Factorization: Elevating Complementary Product Recommendation in E-Commerce

IEEE Paper: A Comparative Analysis of Artificial Intelligence Algorithm Using Big Data

Patent: Smart Public Transportation System with IoT Routing Optimization (2024)

Patent: Multimedia Data Processing Using Hybrid Blockchain Technologies in Healthcare Industry

Best Researcher Award: ICETBP 2024

Certifications

Google Cloud Certified Professional Machine Learning Engineer

Google Cloud Certified Professional Cloud Architect

Google Cloud Certified Generative AI Leader

Google Cloud Certified Associate Cloud Engineer

AWS Certified Machine Learning Engineer - Associate

Project Overview

Lung Cancer Risk Prediction Pipeline with Sybil AI

  • Led end-to-end solution architecture by designing scalable blueprints, drafting Statements of Work (SOWs), and defining Level of Effort (LOE) for engineering teams.
  • Architected and implemented a production-grade ML pipeline for lung cancer risk prediction using Kubeflow Pipelines on GCP, enabling automated processing of CT scan imaging data at scale.
  • Engineered a modular, containerized workflow for the Sybil deep learning model, which generates 6-year cancer risk predictions with 512-dimensional feature embeddings and attention maps for clinical interpretability.
  • Developed specialized DICOM preprocessing algorithms to select optimal CT series based on reconstruction kernel parameters (B50f/LUNG) and image count metrics, enhancing model input quality.
  • Implemented high-throughput data ingestion from GCS buckets for DICOM medical imaging, with batch processing capabilities optimized for NVIDIA L4 GPU acceleration (4x GPUs, 16 CPUs, 60GB memory).
  • Created a RESTful API service using Flask to expose real-time inference endpoints, enabling integration with clinical workflows and third-party applications (a minimal sketch follows this list).
  • Implemented end-to-end pipeline orchestration with automatic model feature extraction, supporting downstream sequential modeling for longitudinal patient risk assessment.
  • Designed comprehensive data serialization protocols for model outputs, preserving hidden layer representations and attention maps for explainable AI in the medical domain.
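
The Flask service mentioned above could look roughly like the sketch below. This is a minimal illustration, not the production code: the /predict route, the series_uri field, and the placeholder predict_risk() helper are hypothetical stand-ins for the real Sybil inference wrapper.

```python
# Minimal sketch of a Flask inference service; route, payload schema,
# and predict_risk() are hypothetical stand-ins for the real model call.
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_risk(series_uri: str) -> dict:
    # Placeholder for the real call (DICOM preprocessing + GPU inference).
    return {"series": series_uri, "risk_scores": [0.02] * 6}  # 6-year risk vector

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(force=True)
    series_uri = payload.get("series_uri")
    if not series_uri:
        return jsonify({"error": "series_uri is required"}), 400
    return jsonify(predict_risk(series_uri))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```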

Secure Enterprise Search Application on PII Data & AIOps - Back Pocket App

  • Designed the solution using GCP-native tools: DLP API, Firestore (NoSQL database), BigQuery, GCS buckets, GAR, and GKE.
  • Led an application development team of 6 AI engineers, data engineers, and infrastructure engineers to deliver the project successfully.
  • Developed a Retrieval-Augmented Generation application (with a CI/CD pipeline) on Personally Identifiable Information (PII) data.
  • Leveraged the Google DLP API to anonymise and de-anonymise PII data in the corpus (see the sketch after this list). Used Gemini 1.5 Pro as an agent to retrieve relevant information from the provided knowledge base.
  • Built a Firestore database to save mappings and leveraged ChromaDB to store the embeddings used for retrieval. Evaluated performance using RapidEvalSDK and Ragas.
  • Enabled authorisation for the secure endpoint using OAuth and JWT. Deployed the FastAPI application on Google Kubernetes Engine and exposed it on an external IP via a load balancer.
  • Enabled monitoring to save results in a BigQuery table and built a Looker dashboard for metrics visualisation (performance metrics, API usage, etc.).
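
A minimal sketch of the DLP-based anonymisation step follows, assuming the google-cloud-dlp client library; the project ID and the two info types are placeholders, and the real application also maintains Firestore mappings for de-anonymisation.

```python
# Sketch: replace PII findings with their info-type label via Cloud DLP.
from google.cloud import dlp_v2

def deidentify_text(project_id: str, text: str) -> str:
    client = dlp_v2.DlpServiceClient()
    response = client.deidentify_content(
        request={
            "parent": f"projects/{project_id}",
            "inspect_config": {
                # Illustrative info types; the real config covers more.
                "info_types": [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}]
            },
            "deidentify_config": {
                "info_type_transformations": {
                    "transformations": [
                        # e.g. "Jane Doe" -> "[PERSON_NAME]"
                        {"primitive_transformation": {"replace_with_info_type_config": {}}}
                    ]
                }
            },
            "item": {"value": text},
        }
    )
    return response.item.value

# Example: deidentify_text("my-project", "Contact Jane Doe at jane@example.com")
```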

 

LLMOps - Deployment of Llama 3 70B Instruct on Vertex AI/GKE and Flux Schnell Fine-Tuning [A100 GPU]

  • Llama 3 deployment: Utilized vLLM (the vLLM engine) for high-performance inference on large language models, enabling the PagedAttention mechanism for efficient memory usage and high-throughput inference. Deployed Llama 3.2 on Vertex AI Endpoints with optimized GPUs and on GKE (Google Kubernetes Engine) for scalable serving (see the sketch after this list). Leveraged the AutoSxS pipeline to perform pairwise model-based evaluation.
  • Flux Schnell fine-tuning (text-to-image): Fine-tuned the FLUX.1 Schnell model using an A100 GPU and the ostris AI toolkit to produce Thomas Kole-style images. Carried out caption engineering and curated 10 images related to the specific style. Leveraged a LoRA adapter and optimised parameters like the trigger word, number of steps, etc. to achieve the desired results. Developed training and inference containers and deployed them on GKE.
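
The sketch below shows offline inference with vLLM's Python API, the PagedAttention-backed engine referenced above; the model ID and tensor_parallel_size are illustrative (a 70B model must be sharded across several GPUs), and the production setup serves through Vertex AI Endpoints/GKE rather than this offline mode.

```python
# Sketch of vLLM offline inference; model ID and GPU count are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    tensor_parallel_size=4,  # shard the 70B model across 4 GPUs
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain PagedAttention in one sentence."], params)
print(outputs[0].outputs[0].text)
```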

Product Recommendation and Optimization - Model Engineering + AIOps

  • Analysed the customer purchase journey – page views, product views, cart item additions, product purchases, and product attributes (BCOM and MCOM). Compared the behaviour of mobile vs desktop users and calculated conversion rate, CTR, etc.
  • Worked on the Personalised Titles project. Carried out data analysis to find the optimum number of look-back days and the percentage cutoff value for the brand use case. The analysis was carried out for both BCOM and MCOM data.
  • Worked on customer segmentation for BCOM and MCOM data. Carried out Recency-Frequency-Monetary (RFM) analysis using customer transaction data. Engineered new features like Recency, Frequency, Monetary, Age, CLTV, and RFM Score. Leveraged the lifetimes package and probabilistic models (Beta-Geometric and Gamma-Gamma fitters) to estimate Customer Lifetime Value and predict sales.
  • Complementary products analysis: Carried out Market Basket Analysis to identify complementary products. Used 'Customer Purchase' and 'Add To Bag' data to find associated items. Filtered transactions with more than one item in the cart and used the Apriori algorithm to compute support values, then applied Association Rule Mining to find items frequently bought together (see the sketch after this list). Experimented with hyperparameters like "min_support", "metric", and "min_threshold" to derive complementary products for every "Site Product Type", and validated the results by repeating the analysis on datasets from different dates. Also researched other approaches such as 'Cleora' and 'Orange'.
  • MLOps activities/deployment: Configured the GCP SDK shell locally. Extracted customer transaction records from a BigQuery table. Versioned data with the help of BQ datasets. Carried out experiments and created models using Vertex AI user-managed notebooks (TensorFlow Enterprise, Workbench) in dev with Kubeflow Pipelines. Created an endpoint and deployed the model to it for batch predictions. Uploaded the model to a GCS bucket for deployment in higher environments. Leveraged Git and GitLab for development and version control, and Data Studio for visualisation.
  • Worked on refactoring HALS Recommendation System models – content-based and collaborative filtering.
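
A minimal sketch of the market-basket step with mlxtend follows; the toy cart data is invented, and min_support, metric, and min_threshold mirror the hyperparameters named above.

```python
# Sketch: Apriori + association-rule mining on toy "Add To Bag" carts.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Toy transactions, already filtered to carts with more than one item.
carts = [
    ["jeans", "belt"],
    ["jeans", "belt", "t-shirt"],
    ["t-shirt", "sneakers"],
    ["jeans", "sneakers", "belt"],
]
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(carts).transform(carts), columns=te.columns_)

frequent = apriori(onehot, min_support=0.3, use_colnames=True)  # support values
rules = association_rules(frequent, metric="lift", min_threshold=1.0)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```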

Sentence Similarity, Summarization and Image Captioning (LLMOps - Deployment of BLIP-2, T5-Flan Base and T5-Flan Medium) - Vertex AI

  • T5-Flan: Fine-tuned T5-Flan on property description data using Vertex AI Pipelines and uploaded the model files to a GCS bucket. Developed and containerised the Flask application (see the sketch after this list).
  • BLIP-2: Deployed the BLIP-2 model for property-data image captioning using a prebuilt Vertex AI Model Garden container for online prediction.
  • Deployed models at endpoints in the usc1 and usw1 regions. Enabled request-response logging to capture requests and responses with a 100% sampling rate. Integrated the endpoints with Uptrends monitoring and observability. Created BQ views by dropping redundant requests.
  • For CI/CD, a Jenkins job pulls the GitHub repo, fetches the model, and builds the image from a Dockerfile (custom container). The containers are registered to GCR/GAR and JFrog Artifactory, followed by endpoint creation, uploading the model to the model registry, and deploying the model for online predictions.
  • Leveraged standard practices like CAST scan, SonarQube scan, and Veracode scan for code quality and security; the automated scans run through a Jenkins job. A Deploy-by-Release job is used for prod deployment. Various SDKs (e.g. Python, gcloud commands) are leveraged to carry out CI/CD.
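
For illustration, the sketch below runs summarization inference with a public Flan-T5 checkpoint from Hugging Face as a stand-in for the fine-tuned model; the checkpoint name and input text are placeholders.

```python
# Sketch: seq2seq summarization with a public Flan-T5 checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

text = "summarize: Spacious 3-bedroom apartment with city views and a modern kitchen."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
ids = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```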

Imagery Analytics - MLOps

  • Orchestrated Python scripts for image-classification PyTorch models using Kubeflow Pipelines and Vertex AI Pipelines in Workbench for continuous training. Leveraged Cloud Scheduler to schedule pipeline runs. Achieved data versioning with the help of Datasets and BigQuery.
  • Registered computer vision models from a GCS bucket and uploaded them to the model registry using Jenkins pipelines. Leveraged the Git and GitHub version control system to create and update the configuration YAML files. Developed PipelineConfig.groovy to set the flow of deployment.
  • Deployed models at endpoints in the usc1 and usw1 regions. Enabled request-response logging to capture requests and responses with a 100% sampling rate. Created BQ views by dropping redundant requests.
  • Experimented with "Deepchecks" and "Alibi Detect" to analyse drift and skew in image data (see the sketch after this list). Worked on integrating the model-monitoring steps into Kubeflow Pipelines.
  • Integrated all the ML applications with SonarQube, Veracode, Uptrends monitoring, and the Elastic analytics engine. Handled incident-management calls with the help of PagerDuty.
  • POC: Worked on imagery data to build end-to-end pipelines using Vertex AI Workbench and deployed the model to an endpoint, leveraging a GCS bucket to store the model.
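
A minimal sketch of drift detection with Alibi Detect's MMD detector follows; random arrays stand in for batches of image embeddings, and the PyTorch backend is assumed to be installed.

```python
# Sketch: MMD-based drift detection on stand-in image embeddings.
import numpy as np
from alibi_detect.cd import MMDDrift

x_ref = np.random.randn(200, 64).astype("float32")   # reference embeddings
detector = MMDDrift(x_ref, backend="pytorch", p_val=0.05)

x_new = np.random.randn(100, 64).astype("float32") + 0.5  # shifted batch
preds = detector.predict(x_new)
print("drift detected:", bool(preds["data"]["is_drift"]))
```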

Cuisine Classification Model

  • Extracted restaurant data from an AWS Redshift database through the Kedro orchestration tool. Used variables like latitude, longitude, and cuisine type for EDA.
  • Built a multi-label, multi-class classification model based on OneVsRestClassifier to predict the cuisine code associated with a particular restaurant, obtaining a best accuracy score of 70% on the validation data and a Hamming loss of 0.6% (see the sketch after this list).
  • Cleaned and tokenized the restaurant names, then vectorized them into a document-word sparse matrix using the TF-IDF vectorizer, experimenting with parameters such as min_df, max_df, token pattern, and n-gram range.
  • Experimented with multiple classification algorithms such as Label Powerset, Multi-Learn KNN, LSTM and OneVsRestClassifier.
  • MLOps pipeline: Created a Python virtual environment using venv for the project. Used the Kedro orchestration framework to create data-processing, data-science, and inference pipelines. Created catalog.yml to point to the data stored in AWS S3 buckets, Redshift, and SQL Server, and created a parameters.yml file to save hyperparameters.
  • Registered the pipelines in the model registry. Used MLflow to track the ML experiments. Used Airflow to create DAGs. Deployed the project on EC2 to get batch predictions.

  • Business impact: Gained credibility by providing accurate data to our internal teams, stakeholders, and customers.
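
A compact sketch of the name-based classifier is shown below: TF-IDF features over restaurant names feeding a OneVsRestClassifier. The toy names and labels are invented, and a logistic-regression base estimator is an assumption.

```python
# Sketch: TF-IDF over restaurant names + one-vs-rest classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

names = ["Luigi's Pizzeria", "Taco Fiesta", "Sushi Zen", "Pasta Palace"]
cuisines = ["italian", "mexican", "japanese", "italian"]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # document-word sparse matrix
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
clf.fit(names, cuisines)
print(clf.predict(["Bella Pasta"]))
```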

Predictive Maintenance On Al Carmen

  • Created a Python script for extract-transform-load (ETL) of the data from the server to Oracle. Built unsupervised ML and DL algorithms such as Isolation Forest, Local Outlier Factor (LOF), and LSTM autoencoders on the machine's multivariate time-series sensor data (see the sketch after this list).
  • Orchestrated the ML pipelines using the Kubeflow orchestration framework. Created components, compiled the pipelines, and deployed them using Vertex AI Pipelines. Leveraged Cloud Scheduler to schedule the batch-inference pipeline every 14 days.
  • Created line charts for each sensor value using the seaborn and matplotlib libraries, performed exhaustive EDA, and checked the distribution of the data.
  • Experimented with different contamination levels and verified them against critical limits given by the client (user).
  • Plotted line charts for each sensor's data showing anomalies. Created a table showing anomalies vs time in Power BI and published it for use by the client.
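
The sketch below illustrates one of the listed algorithms, Isolation Forest, on synthetic multivariate sensor data; the contamination value stands in for the tuned hyperparameter mentioned above.

```python
# Sketch: unsupervised anomaly detection on synthetic sensor channels.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(500, 4))     # 4 sensor channels, normal regime
anomalies = rng.normal(6, 1, size=(10, 4))   # injected fault readings
X = np.vstack([normal, anomalies])

model = IsolationForest(contamination=0.02, random_state=0).fit(X)
labels = model.predict(X)                    # -1 = anomaly, 1 = normal
print("anomalies flagged:", int((labels == -1).sum()))
```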

 

Credit Risk Management

  • Developed an XGBoost-based binary classification model to predict whether a customer will default on a loan and obtained an AUPRC score of 92% on test data.
  • Engineered a new class of attributes known as Decayed Field Variables and developed out-of-pattern variables on historical loan and bureau data to identify risky customers and strengthen the underwriting process.
  • Performed missing-value imputation using the KNN imputer, implemented SMOTE to oversample the minority-class observations, and carried out hyperparameter tuning with Bayesian Optimization.
  • Obtained Model Reason Codes (MRCs) by leveraging SHAP values and SHAP charts such as summary, interaction, and force plots to best explain model predictions (see the sketch after this list).
  • Business impact: Reduced Non-Performing Assets (NPAs) by 2.35% and saved 32,000 manual working hours yearly, thereby increasing overall efficiency.
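
A minimal sketch of the XGBoost-plus-SHAP workflow follows; synthetic data stands in for the loan and bureau features, and the top-ranked SHAP features play the role of Model Reason Codes.

```python
# Sketch: imbalanced binary classifier + SHAP-based reason codes.
import numpy as np
import shap
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9], random_state=0)  # ~10% defaults
model = xgb.XGBClassifier(n_estimators=200, max_depth=4, eval_metric="aucpr")
model.fit(X, y)

# SHAP values explain each prediction; top features act as reason codes.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])
print(np.argsort(-np.abs(shap_values), axis=1)[:, :3])  # top-3 features per row
```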

Paper Publications

E-Commerce Product Recommendation

  • This research uses deep neural networks (DNNs) to improve complementary-product recommendations in e-commerce systems. Utilizing the MovieLens dataset, the study investigates the effectiveness of the Neural Collaborative Filtering (NeuMF) model, which integrates Generalized Matrix Factorization (GMF) and Multi-Layer Perceptron (MLP) components. The study emphasizes the importance of preprocessing the model dataset to ensure data quality and relevance.  
  • Key performance metrics were analyzed to evaluate the model’s performance, including model accuracy, loss, ROC curve, and precision-recall curve. Results indicate that the NeuMF model effectively captures linear and non-linear user-item interactions, improving recommendation accuracy. The findings underscore the model’s potential to enhance user satisfaction and drive sales in e-commerce platforms.
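
A compact PyTorch sketch of the NeuMF idea described above, fusing a GMF branch (element-wise product of embeddings) with an MLP branch, is given below; the layer sizes are illustrative and this is not the paper's exact architecture.

```python
# Sketch of a NeuMF-style model: GMF + MLP branches fused at the output.
import torch
import torch.nn as nn

class NeuMF(nn.Module):
    def __init__(self, n_users, n_items, dim=16):
        super().__init__()
        self.gmf_user = nn.Embedding(n_users, dim)
        self.gmf_item = nn.Embedding(n_items, dim)
        self.mlp_user = nn.Embedding(n_users, dim)
        self.mlp_item = nn.Embedding(n_items, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim // 2), nn.ReLU())
        self.out = nn.Linear(dim + dim // 2, 1)  # fuse both branches

    def forward(self, users, items):
        gmf = self.gmf_user(users) * self.gmf_item(items)          # linear interactions
        mlp = self.mlp(torch.cat([self.mlp_user(users),
                                  self.mlp_item(items)], dim=-1))  # non-linear interactions
        return torch.sigmoid(self.out(torch.cat([gmf, mlp], dim=-1)))

model = NeuMF(n_users=1000, n_items=2000)
print(model(torch.tensor([1]), torch.tensor([42])).item())
```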

Healthcare Blockchain Data

Recent technological advancements have enabled individuals to interact in various manners using multimedia. The growing use of wearable gadgets, medical detectors, and improved medical facilities has resulted in the continuous production of massive amounts of multimedia data. The application of multimedia techniques in healthcare systems also enables the storage, processing, and transmission of patient information provided in several formats like images, text, and voice via the internet through different smart devices.  

Nonetheless, managing and administering this data presents significant challenges, including data safety, confidentiality, adaptability, and interoperability. To address these challenges, there is a growing trend toward adopting blockchain innovation in medical operations. Blockchain, a distributed and immutable record, offers a promising alternative for ensuring the precision, transparency, and security of data in Internet of Things (IoT) medical facilities. By incorporating blockchain innovation into data processing, medical firms may raise shareholder trust, reduce the risk of data tampering, and improve total data management effectiveness. We tested the proposed approach on openly accessible chest X-rays and CT scans. The blockchain strategy resulted in an 87 percent performance ratio over goods drop proportion, falsification, wormhole attack, and likelihood-based verification situations when compared to traditional methods.

Cloud Response Prediction

A cloud computation platform that acquires complicated user demands with multiple subtasks is evaluated in terms of response time. To shorten servicing duration, operations are broken down into lesser parts and handled simultaneously. Measuring response times in a multilayer network is an important but difficult analytical component in Quality of Service (QoS) evaluation. Modulating the functioning of QoS variables, including response time and throughput, can lead to more effective cloud-based services for customers.  

Modern analytical approaches can provide accurate estimations of average response times. Nevertheless, precisely estimating response time dispersion for service-level evaluation is a difficult issue. Precise estimation of absent QoS data is crucial for proposing acceptable online services to end users, as the QoS variable matrices are typically lacking. This investigation developed an artificial neural network (ANN) framework to forecast absent QoS data for estimating response time. This study compares the performance of various ANN methods for predicting QoS response time. The Bayesian-Regularized ANN method outperforms other learning methods in terms of performance.
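
As a rough illustration of the approach, the sketch below fits a feed-forward ANN to synthetic QoS data using scikit-learn's MLPRegressor; this is a stand-in, since scikit-learn does not implement the Bayesian-Regularized ANN that the study found performed best.

```python
# Sketch: ANN regression of response time from synthetic QoS features.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 6))                                       # QoS features
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.05, 500)   # response time

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(X_tr, y_tr)
print("R^2 on held-out data:", round(model.score(X_te, y_te), 3))
```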

AI Algorithms on Big Data

Recent advancements in computing systems and Big Data platforms have facilitated the development of artificial intelligence (AI). AI algorithms have proven to be quite useful in big data processing, particularly data classification. Unstructured data is commonly collected through social networking sites such as Facebook, Instagram, Twitter, and others. Converting unstructured data to structured data is a time-consuming activity. Unstructured data is transformed into structured data using various AI algorithms.  

The unstructured data for this study was initially gathered using the Instagram application programming interface (API) feed from a popular social networking site. Furthermore, utilizing the gathered database, different AI algorithms such as neural network (NN), k-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), decision tree (DT), and logistic regression (LR) are applied in this study. The results imply that the SVM algorithm's performance is higher compared to the other five AI algorithms.
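
Such a six-algorithm comparison could be sketched as below with scikit-learn, using a synthetic stand-in dataset (the Instagram-derived data is not public); results on toy data will not reproduce the paper's finding.

```python
# Sketch: cross-validated comparison of the six algorithms named above.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=12, random_state=0)
models = {
    "NN": MLPClassifier(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "RF": RandomForestClassifier(),
    "DT": DecisionTreeClassifier(),
    "LR": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    print(name, round(cross_val_score(model, X, y, cv=5).mean(), 3))
```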

Trusted by top companies and a growing network of AI professionals, reflecting our commitment to real-world impact.

Why Choose Us

Expert Mentorship

Hands-On Projects

Small Batch Sizes

Career Support

Real-World AI

Work Hours

We’re here to assist you—reach out during our working hours or contact us anytime online.
