Blog

Stories on data science, artificial intelligence, and deep learning

77 posts

[Paper Review] Aligning machine and human visual representations across abstraction levels
2026-01-03 · Paper Review

...

[Paper Review] Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration
2026-01-03 · Paper Review

Hyperparameter tuning can dramatically impact training stability and final performance of large-scale models. Recent works on neural network parameterisations, such as μP, have enabled transfer of o...

Tags: cs.LG · cs.AI · +1

[Paper Review] Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks
2026-01-03 · Paper Review

We present the surprising finding that a language model's reasoning capabilities can be improved by training on synthetic datasets of chain-of-thought (CoT) traces from more capable models, even when ...

Tags: cs.AI · cs.LG · +1

[Paper Review] SemanticGen: Video Generation in Semantic Space
2026-01-03 · Paper Review

State-of-the-art video generative models typically learn the distribution of video latents in the VAE space and map them to pixels using a VAE decoder. While this approach can generate high-quality vi...

Tags: cs.CV

[Paper Review] The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding
2026-01-03 · Paper Review

Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and pixel encoders. Interestingly, our stud...

Tags: cs.CV

[Paper Review] Epistemological Fault Lines Between Human and Artificial Intelligence
2026-01-03 · Paper Review

Large language models (LLMs) are widely described as artificial intelligence, yet their epistemic profile diverges sharply from human cognition. Here we show that the apparent alignment between human ...

Tags: cs.CY · cs.CL · +1

[Paper Review] Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement
2026-01-03 · Paper Review

We present MACLA, a framework that decouples reasoning from learning by maintaining a frozen large language model while performing all adaptation in an external hierarchical procedural memory. MACLA e...

Tags: cs.LG · cs.AI · +1

[Paper Review] Sophia: A Persistent Agent Framework of Artificial Life
2026-01-03 · Paper Review

The development of LLMs has elevated AI agents from task-specific tools to long-lived, decision-making entities. Yet, most architectures remain static and reactive, tethered to manually defined, narro...

Tags: cs.AI

[Paper Review] Distributional AGI Safety
2026-01-03 · Paper Review

AI safety and alignment research has predominantly been focused on methods for safeguarding individual AI systems, resting on the assumption of an eventual emergence of a monolithic Artificial General...

Tags: cs.AI

[Paper Review] LLaDA2.0: Scaling Up Diffusion Language Models to 100B
2026-01-03 · Paper Review

This paper presents LLaDA2.0 -- a family of discrete diffusion large language models (dLLM) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establi...

Tags: cs.LG · cs.AI · +1

[Paper Review] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
2026-01-03 · Paper Review

Autoregressive models (ARMs) are hindered by slow sequential inference. While masked diffusion models (MDMs) offer a parallel alternative, they suffer from critical drawbacks: high computational overh...

Tags: cs.CL · cs.AI · +1

[Paper Review] Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
2026-01-03 · Paper Review

Recent strides in video generation have paved the way for unified audio-visual generation. In this work, we present Seedance 1.5 pro, a foundational model engineered specifically for native, joint aud...

Tags: cs.CV

[Paper Review] An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
2026-01-03 · Paper Review

Vision-Language-Action (VLA) models are driving a revolution in robotics, enabling machines to understand instructions and interact with the physical world. This field is exploding with new models and...

Tags: cs.RO

[Paper Review] Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
2026-01-03 · Paper Review

The landscape of high-performance image generation models is currently dominated by proprietary systems, such as Nano Banana Pro and Seedream 4.0. Leading open-source alternatives, including Qwen-Imag...

Tags: cs.CV

[Paper Review] ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
2026-01-03 · Paper Review

Large language models are powerful generalists, yet solving deep and complex problems such as those of the Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensiv...

Tags: cs.CL · cs.AI · +1

[Paper Review] Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
2026-01-03 · Paper Review

Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy-sensitive. Many such generation tasks require coordinate...

Tags: cs.CL · cs.AI · +1

[Paper Review] Flow Map Distillation Without Data
2026-01-03 · Paper Review

State-of-the-art flow models achieve remarkable quality but require slow, iterative sampling. To accelerate this, flow maps can be distilled from pre-trained teachers, a procedure that conventionally ...

Tags: cs.LG · cs.CV · +1

[Paper Review] Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
2026-01-03 · Paper Review

Vision-Language Models (VLMs) excel at reasoning in linguistic space but struggle with perceptual understanding that requires dense visual perception, e.g., spatial reasoning and geometric awareness. ...

Tags: cs.CV · cs.AI · +1

[Paper Review] CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
2026-01-03 · Paper Review

Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge but still suffers from long contexts and disjoint retrieval-generation optimization. In this work, we...

Tags: cs.CL

[Paper Review] OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
2026-01-03 · Paper Review

With the rapid development of Large Language Models (LLMs), AI agents have demonstrated increasing proficiency in scientific tasks, ranging from hypothesis generation and experimental design to manusc...

Tags: cs.CY · cs.CE · +1

[Paper Review] Evolution Strategies at the Hyperscale
2026-01-03 · Paper Review

We introduce Evolution Guided General Optimization via Low-rank Learning (EGGROLL), an evolution strategies (ES) algorithm designed to scale backprop-free optimization to large population sizes for mo...

Tags: cs.LG · cs.AI · +1

[Paper Review] A Primer on Quantum Machine Learning
2026-01-03 · Paper Review

Quantum machine learning (QML) is a computational paradigm that seeks to apply quantum-mechanical resources to solve learning problems. As such, the goal of this framework is to leverage quantum proce...

Tags: quant-ph · cs.AI · +1

[Paper Review] Fine-Tuned LLMs Know They Don't Know: A Parameter-Efficient Approach to Recovering Honesty
2026-01-03 · Paper Review

The honesty of Large Language Models (LLMs) is increasingly important for safe deployment in high-stakes domains. However, this crucial trait is severely undermined by supervised fine-tuning (SFT), a ...

Tags: cs.CL

[Paper Review] Black-Box On-Policy Distillation of Large Language Models
2026-01-03 · Paper Review

Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work...

Tags: cs.CL · cs.AI · +1

[Paper Review] DoPE: Denoising Rotary Position Embedding
2026-01-03 · Paper Review

Rotary Position Embedding (RoPE) in Transformer models has inherent limits that weaken length extrapolation. We reinterpret the attention map with positional encoding as a noisy feature map, and propo...

Tags: cs.CL

[Paper Review] Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
2026-01-03 · Paper Review

Improving reasoning capabilities of Large Language Models (LLMs), especially under parameter constraints, is crucial for real-world applications. Prior work proposes recurrent transformers, which allo...

Tags: cs.CL · cs.AI · +1

[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
2026-01-03 · Paper Review

Challenging the prevailing consensus that small models inherently lack robust reasoning, this report introduces VibeThinker-1.5B, a 1.5B-parameter dense model developed via our Spectrum-to-Signal Prin...

Tags: cs.AI · cs.CL · +1

[Paper Review] World Simulation with Video Foundation Models for Physical AI
2026-01-03 · Paper Review

We introduce Cosmos-Predict2.5, the latest generation of the Cosmos World Foundation Models for Physical AI. Built on a flow-based architecture, Cosmos-Predict2.5 unifies Text2World, Image2World, ...

Tags: cs.CV · cs.AI · +1

[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models
2026-01-03 · Paper Review

We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize ...

Tags: cs.AI · cs.CL · +1

[Paper Review] Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
2026-01-03 · Paper Review

Where do learning signals come from when there is no ground truth in post-training? We propose turning exploration into supervision through Compute as Teacher (CaT), which converts the model's own exp...

Tags: cs.LG

[Paper Review] Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
2026-01-03 · Paper Review

The attention mechanism in a Transformer architecture matches key to query based on both content -- the what -- and position in a sequence -- the where. We present an analysis indicating that what and...

Tags: cs.LG · cs.AI · +1

[Paper Review] Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
2026-01-03 · Paper Review

Reinforcement Learning (RL) has proven highly effective at enhancing the complex reasoning abilities of Large Language Models (LLMs), yet the underlying mechanisms driving this success remain largely opaq...

Tags: cs.AI · cs.CL · +1

[Paper Review] From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
2026-01-03 · Paper Review

Artificial intelligence (AI) is reshaping scientific discovery, evolving from specialized computational tools into autonomous research partners. We position Agentic Science as a pivotal stage within t...

Tags: cs.LG

[Paper Review] Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
2026-01-03 · Paper Review

We present AlphaGeometry2 (AG2), a significantly improved version of AlphaGeometry introduced in (Trinh et al., 2024), which has now surpassed an average gold medalist in solving Olympiad geometry pro...

Tags: cs.AI · cs.LG · +1

[Paper Review] Nested Learning: The Illusion of Deep Learning Architecture
2026-01-02 · Paper Review

Over the last decades, developing more powerful neural architectures and simultaneously designing optimization algorithms to effectively train them have been the core of research efforts to enhance th...

[Paper Review] Act2Goal: From World Model To General Goal-conditioned Policy
2026-01-02 · Paper Review

Specifying robotic manipulation tasks in a manner that is both expressive and precise remains a central challenge. While visual goals provide a compact and unambiguous task specification, existing goa...

Tags: cs.RO · cs.AI · +1

[Paper Review] Pruning as a Game: Equilibrium-Driven Sparsification of Neural Networks
2026-01-02 · Paper Review

Neural network pruning is widely used to reduce model size and computational cost. Yet, most existing methods treat sparsity as an externally imposed constraint, enforced through heuristic importance ...

Tags: cs.AI

[Paper Review] AgentEvolver: Towards Efficient Self-Evolving Agent System
2026-01-02 · Paper Review

Autonomous agents powered by large language models (LLMs) have the potential to significantly enhance human productivity by reasoning, using tools, and executing complex tasks in diverse environments....

Tags: cs.LG · cs.AI · +1

[Paper Review] TiDAR: Think in Diffusion, Talk in Autoregression
2026-01-02 · Paper Review

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language model...

Tags: cs.CL · cs.AI · +1

[Paper Review] AlphaResearch: Accelerating New Algorithm Discovery with Language Models
2026-01-02 · Paper Review

Large language models have made significant progress in complex but easy-to-verify problems, yet they still struggle with discovering the unknown. In this paper, we present AlphaResearch, an ...

Tags: cs.CL

[Paper Review] Attention and Compression is all you need for Controllably Efficient Language Models
2026-01-02 · Paper Review

The quadratic cost of attention in transformers motivated the development of efficient approaches: namely sparse and sliding window attention, convolutions and linear attention. Although these approac...

Tags: cs.LG

[Paper Review] The Curved Spacetime of Transformer Architectures
2026-01-02 · Paper Review

We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. Queries and keys induce an effective metric on representation s...

Tags: cs.LG · cs.CL · +1

[Paper Review] No-Human in the Loop: Agentic Evaluation at Scale for Recommendation
2026-01-02 · Paper Review

Evaluating large language models (LLMs) as judges is increasingly critical for building scalable and trustworthy evaluation pipelines. We present ScalingEval, a large-scale benchmarking study that sys...

Tags: cs.AI · cs.IR · +1

[Paper Review] ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
2026-01-02 · Paper Review

Multimodal reasoning requires iterative coordination between language and vision, yet it remains unclear what constitutes a meaningful interleaved chain of thought. We posit that text and image though...

Tags: cs.CV

[Paper Review] Context Engineering 2.0: The Context of Context Engineering
2026-01-02 · Paper Review

Karl Marx once wrote that "the human essence is the ensemble of social relations", suggesting that individuals are not isolated entities but are fundamentally shaped by their interactions with other...

Tags: cs.AI · cs.CL · +1

[Paper Review] Chain-of-Thought Hijacking
2026-01-02 · Paper Review

Large reasoning models (LRMs) achieve higher task performance with more inference-time computation, and prior works suggest this scaled reasoning may also strengthen safety by improving refusal. Yet w...

Tags: cs.AI

[Paper Review] GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning
2026-01-02 · Paper Review

Autonomous agents powered by large language models (LLMs) have shown impressive capabilities in tool manipulation for complex task-solving. However, existing paradigms such as ReAct rely on sequential...

Tags: cs.AI · cs.CL · +1

[Paper Review] Reasoning with Sampling: Your Base Model is Smarter Than You Think
2026-01-02 · Paper Review

Frontier reasoning models have exhibited incredible capabilities across a wide array of disciplines, driven by posttraining large language models (LLMs) with reinforcement learning (RL). However, desp...

Tags: cs.LG · cs.AI · +1

[Paper Review] Cache-to-Cache: Direct Semantic Communication Between Large Language Models
2026-01-02 · Paper Review

Multi-LLM systems harness the complementary strengths of diverse Large Language Models, achieving performance and efficiency gains unattainable by a single model. In existing designs, LLMs communicate...

Tags: cs.CL · cs.LG · +1

[Paper Review] Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks
2026-01-02 · Paper Review

Large language models (LLMs) have shown remarkable advancements in enabling language agents to tackle simple tasks. However, applying them for complex, multi-step, long-horizon tasks remains a challen...

Tags: cs.CL

The Importance of Testing and How to Implement It
2026-01-01 · General

There is probably no developer who has gone through software development without hearing the word "testing." Testing plays an essential role in guaranteeing the reliability and stability of software. It is especially important in data science and artificial intelligence, where the correctness of results and the performance of models must be verified. This post looks at why testing matters and how to impl...

[Paper Review] mHC: Manifold-Constrained Hyper-Connections
2026-01-01 · Paper Review

Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifyi...

Tags: cs.CL · cs.AI · +1

[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models
2026-01-01 · Paper Review

We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize ...

Tags: cs.AI · cs.CL · +1

[Paper Review] Real Deep Research for AI, Robotics and Beyond
2026-01-01 · Paper Review

With the rapid growth of research in AI and robotics, now producing over 10,000 papers annually, it has become increasingly difficult for researchers to stay up to date. Fast evolving trends, the rise o...

Tags: cs.AI · cs.CL · +1

[Paper Review] The Free Transformer
2026-01-01 · Paper Review

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experiment...

Tags: cs.LG

[Paper Review] Training-Free Group Relative Policy Optimization
2026-01-01 · Paper Review

Recent advances in Large Language Model (LLM) agents have demonstrated their promising general capabilities. However, their performance in specialized real-world domains often degrades due to challeng...

Tags: cs.CL

[Paper Review] ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
2026-01-01 · Paper Review

With the growing adoption of large language model agents in persistent real-world roles, they naturally encounter continuous streams of tasks. A key limitation, however, is their failure to learn from...

Tags: cs.AI · cs.CL · +1

[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
2026-01-01 · Paper Review

While Transformers have been the main architecture behind deep learning's success in language modeling, state-space models (SSMs) such as Mamba have recently been shown to match or outperform Transfor...

Tags: cs.LG

[Paper Review] Mamba: Linear-Time Sequence Modeling with Selective State Spaces
2026-01-01 · Paper Review

Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time a...

Tags: cs.LG · cs.AI · +1

[Paper Review] Vector database management systems: Fundamental concepts, use-cases, and current challenges
2026-01-01 · Paper Review

Vector database management systems have emerged as an important component in modern data management, driven by the growing need to computationally describe rich data such as texts, ...

Tags: cs.DB

Improving Model Performance with Data Augmentation
2026-01-01 · General

In artificial intelligence (AI) and machine learning (ML), data is the most valuable asset. Securing a sufficient amount of high-quality data is a key factor in determining model performance. In practice, however, data is often scarce, or collecting it is costly and time-consuming. Data augmentation techniques have drawn attention as a way to address this prob...

Building a RAG System: Principles and Implementation of Retrieval-Augmented Generation
2025-12-31 · NLP

As artificial intelligence (AI) advances, natural language processing (NLP) is seeing a wave of innovation. One example is the RAG (Retrieval-Augmented Generation) system. RAG combines two processes, retrieval and generation, to provide more accurate and richer information. While traditional NLP models rely on large-scale da...

PyTorch Tensor Basics
2025-12-31 · Deep Learning

Deep learning stands at the center of modern artificial intelligence (AI). Among deep learning frameworks, PyTorch is popular with researchers and developers thanks to its flexibility and intuitive interface. To properly understand PyTorch, it is important to first understand the tensor, its basic unit. Tensors represent data...

[Paper Review] QLoRA: Efficient Finetuning of Quantized LLMs
2025-12-31 · Paper Review

We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLo...

Tags: cs.LG

[Paper Review] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2025-12-31 · Paper Review

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed t...

Tags: cs.CL

Introduction to MLOps: Building a Machine Learning Operations Pipeline
2025-12-31 · MLOps

Machine learning (ML) is driving innovation across many industries today. Successfully developing a machine learning model, however, is not enough on its own. Deploying the model to a production environment, monitoring it, and continuously improving it takes dedicated effort, and this is where MLOps plays a crucial role....

LLM Fine-Tuning: LoRA and QLoRA Techniques
2025-12-31 · NLP

Large language models (LLMs) have driven innovation in natural language processing (NLP). These models leverage vast amounts of data to perform a wide range of linguistic tasks. However, LLMs demand enormous computing resources, and the fine-tuning pro...

Diffusion Models: Principles and Applications of Image Generation AI
2025-12-31 · Deep Learning

Artificial intelligence (AI) has advanced rapidly in recent years, drawing particular attention in image generation. At the center of this progress are diffusion models, a powerful technique that learns complex patterns, excels at generating realistic images, and is being applied across many industries....

Deep Learning Basics
2025-12-31 · Deep Learning

Deep learning is one of the most prominent technologies in artificial intelligence (AI), driving transformative change across many fields. Its role in solving complex problems such as image recognition and natural language processing keeps growing. This article explains the basic concepts of deep learning and how to implement them in practi...

Word Embedding in Natural Language Processing
2025-12-30 · NLP

Word embedding is an essential concept in natural language processing (NLP). It is a key technique that helps computers understand and process human language. This blog post covers the basic concepts and main techniques of word embedding, with hands-on practice in Python co...

Understanding the Transformer Attention Mechanism
2025-12-30 · Deep Learning

Natural language processing (NLP) has undergone transformative change in recent years, with the Transformer model at its center. Transformers deliver outstanding performance across a variety of NLP tasks and are used in language modeling, translation, summarization, question answering, and more. At the heart of the Transformer...

Tags: Transformer · Attention · Deep Learning · +2

The Complete Guide to Reinforcement Learning: From Theory to Practice
2025-12-30 · Machine Learning

Reinforcement learning is a key technology in artificial intelligence (AI) that gives machines the ability to learn and make decisions on their own. It has produced groundbreaking results in robot control, game playing, autonomous driving, and more. This guide covers the history, theoretical background, major algorithms, and practical implementation of reinforcement learning in depth.

Tags: Reinforcement Learning · Q-Learning · DQN · +2

[Paper Review] Attention Is All You Need
2025-12-30 · Paper Review

The landmark paper that first proposed the Transformer architecture. By discarding RNNs and CNNs entirely and presenting a new paradigm for sequence modeling built on the attention mechanism alone, it laid the foundation for modern NLP, including BERT and GPT.

Tags: Transformer · Attention · +2

Graph Neural Networks Basics
2025-12-30 · Deep Learning

Artificial intelligence (AI) and machine learning (ML) are driving innovation across many fields. Among these advances, graph neural networks (GNNs) have drawn attention as a powerful tool for efficiently processing complex structured data. Graph data arises naturally in real-world problems such as social networks, recommender systems, and molecular structur...

CNN Image Classification: A Core Technique of Deep Learning
2025-12-30 · Deep Learning

Image classification is one of the most important problems in computer vision, with applications ranging from obstacle recognition in autonomous vehicles to medical image analysis and image tagging on social media. Convolutional neural networks (CNNs) in particular deliver outstanding performance on image data and stand at the center of the deep learning revolution. In this blo...

Tags: CNN · Deep Learning · Computer Vision · +2

The Complete Guide to BERT: A Revolutionary NLP Model
2025-12-30 · NLP

BERT (Bidirectional Encoder Representations from Transformers) is a groundbreaking natural language processing model released by Google in 2018. This guide covers in detail the architecture, training methods, and practical usage of BERT, which introduced a new paradigm to NLP through bidirectional context understanding.

Tags: BERT · NLP · Transformer · +2

Welcome to the SuanLab Blog
2024-12-27 · General

This is the SuanLab blog, where we share a variety of stories on data science, artificial intelligence, and deep learning.

Tags: Welcome · SuanLab · Blog