Implementation of AI in mobile applications: Comparative analysis of On-Device and On-Server approaches on Native Android and Flutter
Ratratatyu · 2026-05-23 · via DEV Community

Hi everyone! Today I want to share practical experience in integrating machine learning models into mobile applications. I recently completed the research and development of two MVP applications (on Native Android and Flutter) and defended this project at an international conference.

In this article, we will analyze in detail the difference between local and server AI computing, compare the implementation features of the native layer on Kotlin and the cross-platform layer on Dart, and also analyze non-obvious bugs that you may encounter when working with the camera and file system.

If you are interested in a specific platform, you can skip directly to the relevant part. The source code of both projects is open under the MIT license; links to the repositories (GitHub/GitLab) are at the end of the article.

Part 1. On-Device vs On-Server: Architectural Choice

Choosing where to deploy a mobile model is always a trade-off between device resources, computational accuracy, and security requirements.

Local approach (On-Device)

The model is deployed directly in the application sandbox and runs inference locally, using the computing power of the smartphone's CPU, GPU or specialized neural processors (NPU/TPU). As an On-Device solution, I used Google ML Kit (Image Labeling SDK).

Pros:

  • Minimal latency: There are no network delays from transmitting heavy media data.
  • Autonomy: Complete independence from the availability and quality of the Internet connection.
  • Confidentiality (Data Privacy): User data does not leave the device and is not transferred to third parties.

Cons:

  • Resource limitation: To keep the application from draining the battery and taking up gigabytes of memory, models are heavily quantized and pruned.
  • Reduced accuracy: Due to compression, the accuracy of lightweight models decreases (on average, accuracy in basic classification tasks tops out at about 61-65%).

Server approach (On-Server)

The model lives on a remote server and is accessible via the API. In my research, I used the Hugging Face Inference API (model google/vit-base-patch16-224).

Pros:

  • High accuracy: On the server you can deploy heavy State-of-the-Art (SOTA) models, LLMs, or large ensembles of neural networks covering a very large set of classes.
  • Client offloading: The smartphone only performs the network request, so it does not heat up or waste power on heavy computation.

Cons:

  • Network dependence: No Internet - no AI.
  • Infrastructure and security costs: You have to encrypt communication channels (TLS/SSL), protect API keys from reverse engineering, and pay for server capacity.

Part 2. Native implementation: Android

In a native application, the key task is to isolate heavy operations from the main interface thread (Main Thread) to avoid UI freezes and Application Not Responding (ANR) errors.

Working with API (On-Server)

To send POST requests (transmitting the image as a byte array), a combination of OkHttp and Retrofit was used. The server's JSON response is converted into strictly typed Kotlin data classes automatically by the converter factory. The network call is encapsulated in a suspend function, so the asynchronous request can be written as ordinary sequential code.

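import android.graphics.Bitmap
import android.util.Log
import kotlinx.coroutines.CancellationException
import okhttp3.MediaType.Companion.toMediaTypeOrNull
import okhttp3.RequestBody
import okhttp3.RequestBody.Companion.toRequestBody
import retrofit2.Retrofit
import retrofit2.converter.gson.GsonConverterFactory
import retrofit2.http.Body
import retrofit2.http.Header
import retrofit2.http.POST
import retrofit2.http.Url

// Placeholder only: keep the real token out of source control.
private const val TOKEN = "..."

data class PredictionResponse(val label: String, val score: Float)

interface HuggingFaceApi {
    // The model path is passed per call and resolved against the base URL.
    @POST
    suspend fun postImage(
        @Url url: String,
        @Header("Authorization") token: String,
        @Body body: RequestBody
    ): List<PredictionResponse>
}

class ApiModel {
    private val retrofit = Retrofit.Builder()
        .baseUrl("https://router.huggingface.co/hf-inference/")
        .addConverterFactory(GsonConverterFactory.create())
        .build()

    val service: HuggingFaceApi = retrofit.create(HuggingFaceApi::class.java)

    // Returns the top label and its score, or a fallback pair on failure.
    suspend fun classifyImage(imageBitmap: Bitmap): Pair<String, Float> {
        return try {
            val imageBytes = compressBitmap(imageBitmap)
            val requestBody = imageBytes.toRequestBody("image/jpeg".toMediaTypeOrNull())

            val result = service.postImage(
                "models/google/vit-base-patch16-224",
                "Bearer $TOKEN",
                requestBody
            )
            val firstLabel = result.firstOrNull()
                ?: return "Not recognized" to 0f

            firstLabel.label to firstLabel.score
        } catch (e: CancellationException) {
            throw e // do not swallow coroutine cancellation
        } catch (e: Exception) {
            Log.e("network", "Request failed", e)
            "Server error" to 0f
        }
    }
}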


Note that we compress the image before sending it to the server, so that we do not send too many bytes over the network.

// Encodes the bitmap as JPEG (quality 80) off the main thread.
suspend fun compressBitmap(bitmap: Bitmap): ByteArray = withContext(Dispatchers.IO) {
    val stream = ByteArrayOutputStream()
    bitmap.compress(Bitmap.CompressFormat.JPEG, 80, stream)
    stream.toByteArray()
}


Local inference (On-Device via ML Kit)

To work with the local ML Kit model, the library is configured via ImageLabelerOptions. We explicitly set setConfidenceThreshold(0.4f), the model's confidence threshold: labels scored below it are not returned. Raising this threshold cuts off false positives, but it also means more images come back with no label at all.
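
To make the effect of the threshold concrete, here is a minimal plain-Kotlin sketch. `Label` and `topLabel` are hypothetical names invented for illustration, not ML Kit types; the only point is that labels below the threshold are dropped before the best one is picked, so a stricter threshold can leave nothing to return.

```kotlin
// Hypothetical stand-in for a labeler result; not an ML Kit class.
data class Label(val text: String, val confidence: Float)

// Keeps only labels at or above the threshold and returns the most confident
// one, or null when every candidate was filtered out.
fun topLabel(labels: List<Label>, threshold: Float): Label? =
    labels.filter { it.confidence >= threshold }
        .maxByOrNull { it.confidence }
```

With candidates scored 0.72, 0.45 and 0.12, `topLabel(labels, 0.4f)` returns the 0.72 label, while `topLabel(labels, 0.8f)` returns null, which the UI would have to show as "not recognized".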

To ensure stability and save RAM, the labeler object is initialized through the Kotlin delegate mechanism (by lazy):

private val labeler by lazy {
    val options = ImageLabelerOptions.Builder()
        .setConfidenceThreshold(0.4f)
        .build()
    ImageLabeling.getClient(options)
}


Why is by lazy here?

  1. Saving resources: An instance of the heavy ML Kit client is created not when the Activity is launched, but strictly at the time of the first request (when the user takes a photo).
  2. Context safety: Initialization is guaranteed to occur when the applicationContext is already fully formed by the operating system, which prevents NullPointerException from occurring.

The call to labeler.process(image) is asynchronous by nature (it returns a Task from Google's Task API). To make it read linearly and fit MVVM, we wrap it in a coroutine and suspend until the result arrives.
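
The wrapping idea can be sketched with the Kotlin standard library alone. `FakeTask`, `awaitResult` and `runSync` below are hypothetical names made up so the example runs outside Android; `FakeTask` only imitates the success/failure listener style of a Task. In a real project, the `await()` extension from kotlinx-coroutines-play-services or `suspendCancellableCoroutine` would typically play this role.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical stand-in for a callback-based task; it completes immediately.
class FakeTask<T : Any>(private val value: T?, private val error: Throwable?) {
    fun addOnSuccessListener(listener: (T) -> Unit): FakeTask<T> {
        if (value != null) listener(value)
        return this
    }

    fun addOnFailureListener(listener: (Throwable) -> Unit): FakeTask<T> {
        if (error != null) listener(error)
        return this
    }
}

// Bridges the callbacks into a suspend function: success resumes the
// coroutine with the value, failure resumes it with the exception.
suspend fun <T : Any> FakeTask<T>.awaitResult(): T = suspendCoroutine { cont ->
    addOnSuccessListener { cont.resume(it) }
        .addOnFailureListener { cont.resumeWithException(it) }
}

// Tiny runner for the demo only; it assumes the coroutine finishes
// without ever really suspending, which holds for FakeTask.
fun <T> runSync(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(object : Continuation<T> {
        override val context: CoroutineContext = EmptyCoroutineContext
        override fun resumeWith(result: Result<T>) {
            outcome = result
        }
    })
    return outcome!!.getOrThrow()
}
```

Calling `runSync { FakeTask("cat", null).awaitResult() }` yields the value as an ordinary return, and a failed task surfaces as a thrown exception, which is what lets the ViewModel use a plain try/catch around the call.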

Architectural layer and flow control

In MainViewModel, all calls are wrapped in viewModelScope.launch. Depending on the position of the state switch (On-Device / On-Server), the required method is launched:

private fun classifyImage(bitmap: Bitmap?) {
    if (bitmap == null) return
    _uiState.update {
        it.copy(isLoading = true)
    }
    viewModelScope.launch {
        val startTime = System.currentTimeMillis()

        try {
            // bitmap is smart-cast to non-null after the check above
            val (label, confidence) = if (_uiState.value.isOnDevice) {
                mlKit.analyze(bitmap)
            } else {
                apiModel.classifyImage(bitmap)
            }

            val duration = System.currentTimeMillis() - startTime

            _uiState.update {
                it.copy(
                    classificationText = label,
                    confidenceValue = confidence,
                    timeTakenDuration = duration,
                    isLoading = false
                )
            }
        } catch (e: Exception) {
            _uiState.update {
                it.copy(
                    classificationText = "Error",
                    confidenceValue = 0f,
                    timeTakenDuration = 0L,
                    isLoading = false // otherwise the loading indicator never stops
                )
            }
        }
    }
}


Working with the camera on Native Android

On the Native Android side, working with the camera looks concise thanks to the modern CameraX SDK. This is a Lifecycle-aware library: it knows when an Activity goes to the background or is destroyed (onDestroy), and automatically releases camera resources and shuts down its use cases (ImageAnalysis / ImageCapture). We do not need to write the cleanup logic by hand, and the result of a successful snapshot can be a ready-made Bitmap object held in RAM, which eliminates unnecessary disk read-write operations.

Part 3. Cross-platform implementation: Flutter (Dart, Dio, Method Channels)

The Flutter application conceptually solves the same problems, but faces the specifics of Dart’s single-threaded architecture (Event Loop).

Network Inference (Dio + Futures)

To communicate with Hugging Face on Flutter, we used the Dio package. To prevent a heavy request and network packet processing from blocking the rendering of UI frames (after all, Dart runs on a single thread), we write the call in the asynchronous Future/await style. While the request is in flight, the Event Loop continues to render the interface.

import 'dart:typed_data';

import 'package:dio/dio.dart';
import 'package:flutter/foundation.dart';

final dio = Dio();

// Sends the compressed image to the Hugging Face endpoint and returns the
// decoded JSON list, or null if compression or the request fails.
Future<List<dynamic>?> apiModel(String path) async {
  final Uint8List? imageBytes = await compressImage(path);

  if (imageBytes == null) {
    return null;
  }

  try {
    final response = await dio.post(
      "https://router.huggingface.co/hf-inference/models/google/vit-base-patch16-224",
      data: imageBytes,
      options: Options(
        headers: {
          // Replace your_token with your Hugging Face token (no extra quotes).
          "Authorization": "Bearer your_token",
          "Content-Type": "image/jpeg",
        },
      ),
    );
    return response.data;
  } on DioException catch (e) {
    debugPrint("Error: $e");
  }
  return null;
}


As on Android, we compress the image before sending it to the server, so that we do not send too many bytes over the network.

Future<Uint8List?> compressImage(String path) async {
  final Uint8List? result = await FlutterImageCompress.compressWithFile(
    path,
    quality: 80,
    format: CompressFormat.jpeg,
  );

  return result;
}


Native bridge: MethodChannel for ML Kit

Since there is no full-fledged direct SDK for ML Kit Image Labeling on Dart that provides the required level of customization, a production-style approach is used: creating a MethodChannel (native bridge).

The Dart code acts as a client: it invokes the imageLabeling method and passes the path to the saved photo through the channel.

class NativeMlService {
  static const MethodChannel _channel = MethodChannel("mlkit_photo_analyze");

  static Future<Map> onDeviceMethod(String imagePath) async {
    final result = await _channel.invokeMethod(
      'imageLabeling',
      {'imagePath': imagePath},
    );
    return Map.from(result);
  }
}


On the Android side (MainActivity.kt) we catch this call through setMethodCallHandler. The same rules apply here: we launch a coroutine on a background thread and process the image via ML Kit, but we switch back to the Main Thread before calling result.success(), since the Flutter engine expects channel replies on the platform's main thread.

 override fun configureFlutterEngine(flutterEngine: FlutterEngine) {
        super.configureFlutterEngine(flutterEngine)

        MethodChannel(flutterEngine.dartExecutor.binaryMessenger, CHANNEL).setMethodCallHandler { call, result ->
            if(call.method == "imageLabeling"){
                val imagePath = call.argument<String>("imagePath")
                if (imagePath == null) {
                    result.error("ArgError", "Image path is null", null)
                    return@setMethodCallHandler
                }
                // Run ML inference on background thread to avoid blocking UI
                CoroutineScope(Dispatchers.IO).launch {
                    try {
                          // image processing and model calling....

                        // Return result on main thread
                        withContext(Dispatchers.Main){
                            result.success(response)
                        }
                    }....
//rest of the code on GitHub/ GitLab....


Camera in Flutter and Data Race (Race Condition)

The most difficult and interesting stage of developing the Flutter version was integrating the camera plugin and debugging the interaction with the file system. Two important differences from the native version came up here:

Manual Lifecycle Management: In Flutter, the developer must manually initialize the CameraController, query the available cameras (selecting CameraLensDirection.back) and, most importantly, call _controller?.dispose() in the widget's dispose() method.

If you forget, the camera stays locked by the app, and other applications will not be able to open it.

Ghost File Problem (Race Condition):
The _controller?.takePicture() method in Flutter returns an XFile object that physically stores the snapshot in the device cache directory (image.path). This is where a classic race condition comes into play.

When Flutter reports that the photo has been taken and passes the path to the native code via MethodChannel, the native part (Kotlin) immediately tries to execute BitmapFactory.decodeFile(imagePath). But at the level of the Android operating system, the file in the cache may not be available yet: the stream writing data from the camera buffer to disk has not finished and closed.

This showed up in the logs as a failure:
E/ple.flutter_mvp: FrameInsert open fail: No such file or directory
Decoding failed on the native side, BitmapFactory returned null, and Flutter received null instead of a data structure.

We get a similar error when we send the picture to the server, because we are effectively sending an empty image.

Solution to the problem:
To eliminate this data race, two-way protection was applied:

On the Dart (Provider) side: Before calling the native method or sending to the server, we give the system a moment to finish by adding a short delay:

await Future.delayed(const Duration(milliseconds: 200));

In practice, this was enough time for the OS to complete the disk operations.
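
A fixed delay is simple, but it does not guarantee that the file is ready on a slow device. As a possible complement on the Kotlin side (an alternative technique, not what the project does), the sketch below polls until the file exists and its size has stopped changing, or a timeout expires. `awaitFileReady` and its parameters are hypothetical, and the function blocks the calling thread, so it assumes it is called from a background thread.

```kotlin
import java.io.File

// Polls until the file exists, is non-empty, and reports the same size on two
// consecutive checks; returns false if that does not happen before the timeout.
fun awaitFileReady(file: File, timeoutMs: Long = 1_000, pollMs: Long = 50): Boolean {
    val deadline = System.currentTimeMillis() + timeoutMs
    var lastSize = -1L
    while (System.currentTimeMillis() < deadline) {
        val size = if (file.exists()) file.length() else -1L
        if (size > 0 && size == lastSize) return true
        lastSize = size
        Thread.sleep(pollMs)
    }
    return false
}
```

In the method call handler, such a check could run before BitmapFactory.decodeFile(imagePath), reporting an error to Flutter when it returns false instead of passing a null bitmap further down.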

Conclusions

The MVP study shows that On-Device and On-Server approaches do not compete but complement each other.

  • On-Server is indispensable for heavy computing (LLMs such as GPT, high-definition video processing).
  • On-Device is ideal for utilitarian tasks (scanning documents, recognizing simple objects, working in strict offline conditions).

In modern production applications, the best practice is a hybrid approach: a fast first-pass result is computed locally, and deeper validation is delegated to the backend server.
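
One way to read the hybrid idea in code: try the local model first and go to the server only when the local confidence is too low. Everything here is an illustrative assumption, including `Prediction`, `classifyHybrid`, the `0.6f` cut-off, and the two function parameters standing in for the on-device and server classifiers.

```kotlin
// Hypothetical result type shared by both classifiers.
data class Prediction(val label: String, val confidence: Float, val source: String)

// Uses the local result when it is confident enough; otherwise asks the remote
// classifier and falls back to the local result if the remote call fails.
fun classifyHybrid(
    local: () -> Prediction,
    remote: () -> Prediction,
    minLocalConfidence: Float = 0.6f
): Prediction {
    val onDevice = local()
    if (onDevice.confidence >= minLocalConfidence) return onDevice
    return try {
        remote()
    } catch (e: Exception) {
        onDevice // offline or server error: keep the weaker local answer
    }
}
```

The design choice is that the network is touched only for uncertain images, so confident cases keep the low latency and privacy of the local path, and an offline device still gets an answer.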

Regarding the choice of platform: Native Android gives full control over resources, hardware and threads out of the box. Flutter, despite the limitations of single-threaded Dart, lets you build responsive and performant AI applications, provided you use MethodChannel properly, follow the rules for dispatching coroutines in the native layer, and take file system timings into account.

GitHub

RatRatatyu / mobile-ai-mvp

Two MVP applications demonstrating on-device and on-server AI model integration in Jetpack Compose (Android) and Flutter.

Mobile AI Integration: On-Device vs On-Server MVP Comparison

This repository contains two MVP applications developed for the International Scientific and Practical Conference
"Student Research: Challenges and Development Trends".


🏫 Conference Information

  • Event: International Scientific and Practical Conference
    "Student Research: Challenges and Development Trends"

  • Organizers:
    Ministry of Education of the Republic of Kazakhstan,
    Department of Education of Aktobe Region,
    Aktobe Higher Humanitarian College,
    National Centre for Professional Development "Orleu"

  • Section:
    Science, Technology, and Digital Innovations

  • Date:
    May 22, 2026


📱 Project Overview

The project explores the architectural choice between running AI models directly on a smartphone (On-Device) versus processing them on a remote server (On-Server)

For this research, image classification was chosen as the primary use case to demonstrate the differences in performance and accuracy

🤖 Applied Models

On-Device: Powered by the ML Kit Image Labeling API from Google for local, real-time inference

On-Server: Powered by the Hugging Face google/vit-base-patch16-224…

GitLab

RatRatatyu / mobile-ai-mvp · GitLab (gitlab.com)

I would like to note that I am still learning in this area, so my conclusions may be inaccurate or some descriptions not entirely correct. I will be grateful if you point out my mistakes in the comments, and glad if you star the projects on GitHub and GitLab if they are useful to you.