K Flow deploys and manages custom, end-to-end MLOps and analytics platforms in a fraction of the time and a fraction of the cost. We integrate best-in-class open source components, empowering your data science team with the workflows they want.
Skip the years-long process of building from scratch. K Flow deploys your platform in a matter of minutes.
The fastest way to access a feature-complete platform, with dozens of deeply integrated modules available and customized as needed.
All components are purely open source, and every integrated platform is deployed on Kubernetes to provide scalability, reliability, and portability.
Skyler is a foremost expert on the architecture of AI / ML infrastructure on Kubernetes. Prior to founding K Flow, Skyler was a Distinguished Technologist at HPE (and MapR), where he was Chief Architect for ML, Analytics, and Data on Kubernetes. Skyler’s dozens of successful deployments include systems for some of the world's largest automakers, financial services firms, and medical device companies. Earlier in his career, Skyler founded the ML startup QuantifiedSky to explore whether emerging deep nets could be used to predict health anomalies from wearable fitness devices such as heart rate monitors. Before QuantifiedSky, Skyler was an STSM and Chief Architect in IBM’s Software Lab. At IBM, Skyler worked with most of the Fortune 500 at one time or another. He successfully deployed hundreds of massive-scale systems, some of which included ML technologies such as collaborative filtering for personalization and early Watson.
Prior to founding K Flow, Misha was the Founder and CEO of EnvoyAI, the first marketplace for machine learning algorithms for medical imaging applications. EnvoyAI saw rapid traction integrating trained algorithms into the highly secure IT infrastructure of radiology departments. Misha sold EnvoyAI to TeraRecon in November 2018. The acquired assets are still the ML/AI offering in the TeraRecon portfolio (since rebranded as Eureka Clinical AI). Misha is launching K Flow as his second machine learning platform company after a year of research and validation within Primary Venture Partners. He has the support of the Primary fund and a deep network of investors, advisors, and engineering talent already bought into the K Flow vision.
We leverage our large portfolio of open source technologies and our modular architecture to build an infrastructure that supports your specific needs.
Constructing good features is critical to creating useful models. A world-class feature store can greatly simplify the work of data scientists. K Flow’s platform includes the best-in-class open-source feature store, Feast. K Flow also integrates with existing Delta Lake or Iceberg feature stores. K Flow can accelerate your feature engineering using Pandas or Spark DataFrames or with SQL via Spark or dbt. Additional training examples can be programmatically created via Snorkel, and we can accelerate the integration of labeling with tools including Label Studio and Amazon’s Mechanical Turk.
Notebooks are an incredibly powerful tool to help data scientists get their work done quickly. However, many notebook tasks are better transformed into pipelines or workflows for repeatability and operationalization. K Flow can provide a platform with your choice of pipeline engine, including the best-in-class Kubeflow Pipelines, the lightweight Metaflow, MLflow Pipelines, or the popular data-focused Airflow. K Flow integrates the Elyra plugin into our notebook containers for quick transformation of notebooks into pipeline tasks for Kubeflow Pipelines and Airflow.
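To make the notebook-to-pipeline idea concrete, here is a minimal sketch in plain Python. It is not the Kubeflow Pipelines, Airflow, or Elyra API; it only illustrates the underlying idea that each notebook task becomes a named step, and the engine runs steps in dependency order. The step names, the `run_pipeline` helper, and the toy "model" are all hypothetical, and the sketch assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies):
    """Run each step once, after every step it depends on.

    steps: dict mapping step name -> function that takes the results dict
    dependencies: dict mapping step name -> set of upstream step names
    """
    # Steps without declared dependencies still need a graph entry.
    graph = {name: set(dependencies.get(name, ())) for name in steps}
    results = {}
    order = []
    for name in TopologicalSorter(graph).static_order():
        results[name] = steps[name](results)
        order.append(name)
    return order, results


# Hypothetical notebook cells, each rewritten as a pipeline step.
def load_data(results):
    return [1.0, 2.0, 3.0, 4.0]


def build_features(results):
    return [x * 2 for x in results["load_data"]]


def train_model(results):
    features = results["build_features"]
    return sum(features) / len(features)  # stand-in "model": the feature mean


STEPS = {
    "train_model": train_model,
    "build_features": build_features,
    "load_data": load_data,
}
DEPENDENCIES = {
    "build_features": {"load_data"},
    "train_model": {"build_features"},
}
```

Calling `run_pipeline(STEPS, DEPENDENCIES)` runs the three steps in dependency order regardless of how the dictionary is ordered. A real pipeline engine adds what this sketch leaves out: containerized steps, retries, caching, and scheduling.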
Model serving and inference are an afterthought for many ML platforms. K Flow treats inference and serving with the care they deserve. K Flow can deploy either the stable and trusted Seldon Core or the new and increasingly popular KServe as part of your platform. With either of these inference frameworks, you get powerful features including Inference Graphs, A/B testing, and Canary Deployments. These frameworks have deep integration with model servers such as NVIDIA Triton and Seldon MLServer. They can connect to your online Feast Feature Store, or, if you prefer to serve your models in a different environment, K Flow can also deploy them to external frameworks such as SageMaker or Vertex AI.
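As an illustration of what a canary deployment does with traffic, the sketch below routes a fixed percentage of requests to a new model version. It is not how Seldon Core or KServe implement the split (those frameworks are configured declaratively); the `choose_variant` function and the hash-based bucketing are assumptions chosen so that the same request ID always lands on the same variant.

```python
import hashlib


def choose_variant(request_id, canary_percent):
    """Return "canary" for roughly canary_percent% of request IDs, else "stable"."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the ID so the assignment is stable across calls and processes.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"
```

For example, `choose_variant("user-123", 10)` sends that request ID to the canary only if its bucket is one of the ten lowest. Raising the percentage step by step shifts more traffic to the new model without reassigning IDs that were already on the canary.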
Models need to be monitored in production. Concept drift can occur at any time. You may want to alert data scientists of issues or automatically have pipelines retrain models when drift occurs. K Flow provides deep integration with inference frameworks to make model monitoring easier, including several drift detection libraries from Alibi or AI 360. K Flow also integrates libraries that examine fairness, provide model explainability, and assess adversarial robustness.
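One common way to flag drift is to compare the distribution of a feature or model score in production against a reference sample from training. The sketch below uses the two-sample Kolmogorov-Smirnov statistic for that comparison. It is a hand-rolled stand-in, not the Alibi API, and the `drift_detected` helper and its `threshold` of 0.3 are hypothetical values that would need tuning for a real workload.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Empirical CDFs only change at observed values, so checking those is enough.
    for value in ref + cur:
        ref_cdf = bisect_right(ref, value) / len(ref)
        cur_cdf = bisect_right(cur, value) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def drift_detected(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a (hypothetical) threshold."""
    return ks_statistic(reference, current) > threshold
```

A statistic of 0 means the two samples have identical empirical distributions, and 1 means they do not overlap at all. A result above the threshold could raise an alert or trigger a retraining pipeline.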
Concept drift is only one part of the monitoring equation. Monitoring your data pipelines for data quality issues and data drift is also important. The K Flow platform can install and integrate tools such as Great Expectations and Soda into your data pipelines and feature store.
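The kind of check these data quality tools run can be sketched in a few lines. The example below is not the Great Expectations or Soda API; `check_column` and its parameters are hypothetical, and rows are assumed to be plain dictionaries. It tests one column for missing values and out-of-range values and reports each failure as a message.

```python
def check_column(rows, column, min_value=None, max_value=None, max_null_fraction=0.0):
    """Return a list of human-readable failures for one column (empty if clean)."""
    failures = []
    values = [row.get(column) for row in rows]
    nulls = sum(1 for v in values if v is None)
    if rows and nulls / len(rows) > max_null_fraction:
        failures.append(f"{column}: {nulls} of {len(rows)} values are null")
    present = [v for v in values if v is not None]
    if min_value is not None and any(v < min_value for v in present):
        failures.append(f"{column}: value below {min_value}")
    if max_value is not None and any(v > max_value for v in present):
        failures.append(f"{column}: value above {max_value}")
    return failures
```

Run as a pipeline step, a non-empty result can stop bad data before it reaches the feature store or a training job.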
While some machine learning use cases benefit from full automation, others require a human in the loop to evaluate various model scores and charts. A model tracker and registry can provide the visualizations needed for a data scientist to understand various experiments and compare models and runs. K Flow can install and integrate tracking and registration of your models with MLflow or TensorBoard. It can also push experiment results and metadata to external registries such as Weights & Biases, Neptune, SageMaker, Vertex AI, or Verta.
Models can be trained in a notebook with any of a number of Python frameworks including TensorFlow, PyTorch, scikit-learn, and XGBoost. Linear models, trees, and neural nets can all be trained on CPUs, GPUs, and TPUs. The K Flow platform allows you to train directly in a notebook or distribute training jobs across machines using the Kubeflow Training Operator and Fairing Framework. Add-on libraries for the base frameworks, including Keras for TensorFlow, Hugging Face Transformers, and PyTorch Lightning for PyTorch, are also supported. Models can be automatically compiled into the ONNX format as a pipeline step to remove the need for server container dependency matching. Scheduling support is provided for CUDA and RAPIDS.
Notebooks are the primary tool that data scientists use during their day. Exploratory data analysis, feature engineering, and model training experiments primarily occur inside notebooks. Notebook containers should be easily and securely launched with the necessary dependencies built in. Notebook files should be easily shared among members of your team. K Flow provides the Kubeflow Notebook Operator and a set of base notebook containers for this task. This operator supports JupyterLab, RStudio, and VS Code notebooks. K Flow works with you to customize base containers with the libraries and dependencies that are most important for your team. We also integrate with a number of external notebook solutions including Deepnote, SageMaker Studio, Vertex AI Workbench, Saturn Cloud, and Databricks Notebooks.
Sometimes it’s easier to use an ML framework to select hyperparameters for your model. K Flow ships with the Katib AutoML training engine. K Flow can also integrate with various Python AutoML libraries such as PyCaret and auto-sklearn. AutoML solutions from vendors like H2O or DataRobot are also available. All of these AutoML training options, including Katib, can be added into your pipelines and function with the rest of your platform.
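Automated hyperparameter selection can be as simple as sampling configurations and keeping the best one. The sketch below shows random search in plain Python. It is not the Katib API; `random_search`, the `search_space` format, and `toy_objective` (a stand-in for a real validation loss) are all hypothetical.

```python
import random


def random_search(objective, search_space, n_trials=20, seed=0):
    """Sample n_trials configurations uniformly; return the one with the lowest score.

    search_space: dict mapping parameter name -> (low, high) bounds
    """
    rng = random.Random(seed)  # seeded so a search can be reproduced
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in search_space.items()
        }
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Stand-in for a validation loss; lowest at learning_rate = 0.1.
    return (params["learning_rate"] - 0.1) ** 2
```

In practice the objective would train and evaluate a model, and each trial would typically run as its own job so that trials can execute in parallel.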
K Flow supports a wide variety of data sources including data warehouses, data lakes, and databases. Data from Snowflake, BigQuery, Redshift, PostgreSQL, Spark, and Presto (Trino) can be ingested directly into a Feast feature store. In other cases, you may want to use Spark Structured Streaming, Airflow, Dataiku, or Prefect to ingest data from external sources into the platform.
Capturing the full lineage of your datasets and training code is critical to supporting your ability to recreate any experiment, as well as for regulatory and auditing purposes. K Flow makes this lineage information visible in both your model registry and your inference services. Code lineage is easily tracked via a Git integration into notebooks. The K Flow platform’s Feast feature store integration provides built-in lineage capabilities. DVC is also available for structured or unstructured datasets that are not available in the feature store. K Flow can also provide pipeline tasks that integrate with an external lineage solution such as Pachyderm.
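At its core, a lineage record ties a training run to the exact data and code that produced it. The sketch below shows one way to do that with content hashes. It is not the Feast, DVC, or Pachyderm API; `lineage_record`, its field names, and the shortened `run_id` are hypothetical, and the commit value is assumed to come from your Git integration.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Content hash: identical bytes always give the same fingerprint."""
    return hashlib.sha256(data).hexdigest()


def lineage_record(dataset_bytes, code_commit, params):
    """Build a record linking a run to its dataset, code version, and parameters."""
    record = {
        "dataset_sha256": fingerprint(dataset_bytes),
        "code_commit": code_commit,
        "params": params,
    }
    # Sorting keys gives a stable serialization, so equal inputs give equal IDs.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["run_id"] = fingerprint(canonical)[:12]
    return record
```

Because the run ID is derived from the inputs, any change to the dataset, the commit, or the parameters produces a different ID, while rerunning with identical inputs reproduces the same one.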
K Flow bakes security best practices into your platform. K Flow is designed with security in mind from the ground up, including industry standards and protocols such as OIDC and JWT, well-defined Kubernetes namespaces and RBAC, and cloud vendor IAM.
K Flow’s design can help you more easily meet regulatory standards. We ensure your platform meets stringent standards for encryption at rest and in transit, the principle of least privilege, audit logging, key/password/secret management, user authentication, PII/PHI handling, code security and reviews, code scanning, non-repudiation, data integrity, backups, disaster recovery, and network segmentation. This supports your certification for standards including GDPR, HIPAA, HITRUST, FDA GxP, FIPS, and FedRAMP.
All K Flow components are provided as Infrastructure as Code. K Flow does not install anything manually. Declarative Terraform scripts are provided to create or alter your cloud environment (this may include Account, IAM, VPC, object and block storage, databases, and managed Kubernetes environment modules). ArgoCD is then used to install various Helm and Kustomize packages. Terraform and ArgoCD combine to form a completely GitOps-based Infrastructure as Code solution.