Blog

Stories on data science, artificial intelligence, and deep learning

77 posts

[Paper Review] Aligning machine and human visual representations across abstraction levels
2026-01-03 · Paper Review

...

[Paper Review] Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration
2026-01-03 · Paper Review

Hyperparameter tuning can dramatically impact training stability and final performance of large-scale models. Recent works on neural network parameterisations, such as μP, have enabled transfer of o...

Tags: cs.LG · cs.AI · +1

[Paper Review] Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks
2026-01-03 · Paper Review

We present the surprising finding that a language model's reasoning capabilities can be improved by training on synthetic datasets of chain-of-thought (CoT) traces from more capable models, even when ...

Tags: cs.AI · cs.LG · +1

[Paper Review] SemanticGen: Video Generation in Semantic Space
2026-01-03 · Paper Review

State-of-the-art video generative models typically learn the distribution of video latents in the VAE space and map them to pixels using a VAE decoder. While this approach can generate high-quality vi...

Tags: cs.CV

[Paper Review] The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding
2026-01-03 · Paper Review

Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and pixel encoders. Interestingly, our stud...

Tags: cs.CV

[Paper Review] Epistemological Fault Lines Between Human and Artificial Intelligence
2026-01-03 · Paper Review

Large language models (LLMs) are widely described as artificial intelligence, yet their epistemic profile diverges sharply from human cognition. Here we show that the apparent alignment between human ...

Tags: cs.CY · cs.CL · +1

[Paper Review] Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement
2026-01-03 · Paper Review

We present MACLA, a framework that decouples reasoning from learning by maintaining a frozen large language model while performing all adaptation in an external hierarchical procedural memory. MACLA e...

Tags: cs.LG · cs.AI · +1

[Paper Review] Sophia: A Persistent Agent Framework of Artificial Life
2026-01-03 · Paper Review

The development of LLMs has elevated AI agents from task-specific tools to long-lived, decision-making entities. Yet, most architectures remain static and reactive, tethered to manually defined, narro...

Tags: cs.AI

[Paper Review] Distributional AGI Safety
2026-01-03 · Paper Review

AI safety and alignment research has predominantly been focused on methods for safeguarding individual AI systems, resting on the assumption of an eventual emergence of a monolithic Artificial General...

Tags: cs.AI

[Paper Review] LLaDA2.0: Scaling Up Diffusion Language Models to 100B
2026-01-03 · Paper Review

This paper presents LLaDA2.0 -- a family of discrete diffusion large language models (dLLM) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establi...

Tags: cs.LG · cs.AI · +1

[Paper Review] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
2026-01-03 · Paper Review

Autoregressive models (ARMs) are hindered by slow sequential inference. While masked diffusion models (MDMs) offer a parallel alternative, they suffer from critical drawbacks: high computational overh...

Tags: cs.CL · cs.AI · +1

[Paper Review] Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
2026-01-03 · Paper Review

Recent strides in video generation have paved the way for unified audio-visual generation. In this work, we present Seedance 1.5 pro, a foundational model engineered specifically for native, joint aud...

Tags: cs.CV

[Paper Review] An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
2026-01-03 · Paper Review

Vision-Language-Action (VLA) models are driving a revolution in robotics, enabling machines to understand instructions and interact with the physical world. This field is exploding with new models and...

Tags: cs.RO

[Paper Review] Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
2026-01-03 · Paper Review

The landscape of high-performance image generation models is currently dominated by proprietary systems, such as Nano Banana Pro and Seedream 4.0. Leading open-source alternatives, including Qwen-Imag...

Tags: cs.CV

[Paper Review] ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
2026-01-03 · Paper Review

Large language models are powerful generalists, yet solving deep and complex problems such as those of the Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensiv...

Tags: cs.CL · cs.AI · +1

[Paper Review] Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
2026-01-03 · Paper Review

Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy-sensitive. Many such generation tasks require coordinate...

Tags: cs.CL · cs.AI · +1

[Paper Review] Flow Map Distillation Without Data
2026-01-03 · Paper Review

State-of-the-art flow models achieve remarkable quality but require slow, iterative sampling. To accelerate this, flow maps can be distilled from pre-trained teachers, a procedure that conventionally ...

Tags: cs.LG · cs.CV · +1

[Paper Review] Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
2026-01-03 · Paper Review

Vision-Language Models (VLMs) excel at reasoning in linguistic space but struggle with perceptual understanding that requires dense visual perception, e.g., spatial reasoning and geometric awareness. ...

Tags: cs.CV · cs.AI · +1

[Paper Review] CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
2026-01-03 · Paper Review

Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge but still suffers from long contexts and disjoint retrieval-generation optimization. In this work, we...

Tags: cs.CL

[Paper Review] OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
2026-01-03 · Paper Review

With the rapid development of Large Language Models (LLMs), AI agents have demonstrated increasing proficiency in scientific tasks, ranging from hypothesis generation and experimental design to manusc...

Tags: cs.CY · cs.CE · +1

[Paper Review] Evolution Strategies at the Hyperscale
2026-01-03 · Paper Review

We introduce Evolution Guided General Optimization via Low-rank Learning (EGGROLL), an evolution strategies (ES) algorithm designed to scale backprop-free optimization to large population sizes for mo...

Tags: cs.LG · cs.AI · +1

[Paper Review] A Primer on Quantum Machine Learning
2026-01-03 · Paper Review

Quantum machine learning (QML) is a computational paradigm that seeks to apply quantum-mechanical resources to solve learning problems. As such, the goal of this framework is to leverage quantum proce...

Tags: quant-ph · cs.AI · +1

[Paper Review] Fine-Tuned LLMs Know They Don't Know: A Parameter-Efficient Approach to Recovering Honesty
2026-01-03 · Paper Review

The honesty of Large Language Models (LLMs) is increasingly important for safe deployment in high-stakes domains. However, this crucial trait is severely undermined by supervised fine-tuning (SFT), a ...

Tags: cs.CL

[Paper Review] Black-Box On-Policy Distillation of Large Language Models
2026-01-03 · Paper Review

Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work...

Tags: cs.CL · cs.AI · +1

[Paper Review] DoPE: Denoising Rotary Position Embedding
2026-01-03 · Paper Review

Rotary Position Embedding (RoPE) in Transformer models has inherent limits that weaken length extrapolation. We reinterpret the attention map with positional encoding as a noisy feature map, and propo...

Tags: cs.CL

[Paper Review] Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
2026-01-03 · Paper Review

Improving reasoning capabilities of Large Language Models (LLMs), especially under parameter constraints, is crucial for real-world applications. Prior work proposes recurrent transformers, which allo...

Tags: cs.CL · cs.AI · +1

[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
2026-01-03 · Paper Review

Challenging the prevailing consensus that small models inherently lack robust reasoning, this report introduces VibeThinker-1.5B, a 1.5B-parameter dense model developed via our Spectrum-to-Signal Prin...

Tags: cs.AI · cs.CL · +1

[Paper Review] World Simulation with Video Foundation Models for Physical AI
2026-01-03 · Paper Review

We introduce Cosmos-Predict2.5, the latest generation of the Cosmos World Foundation Models for Physical AI. Built on a flow-based architecture, Cosmos-Predict2.5 unifies Text2World, Image2World, ...

Tags: cs.CV · cs.AI · +1

[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models
2026-01-03 · Paper Review

We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize ...

Tags: cs.AI · cs.CL · +1

[Paper Review] Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
2026-01-03 · Paper Review

Where do learning signals come from when there is no ground truth in post-training? We propose turning exploration into supervision through Compute as Teacher (CaT), which converts the model's own exp...

Tags: cs.LG

[Paper Review] Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
2026-01-03 · Paper Review

The attention mechanism in a Transformer architecture matches key to query based on both content -- the what -- and position in a sequence -- the where. We present an analysis indicating that what and...

Tags: cs.LG · cs.AI · +1

[Paper Review] Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
2026-01-03 · Paper Review

Reinforcement Learning (RL) has proven highly effective at enhancing the complex reasoning abilities of Large Language Models (LLMs), yet the underlying mechanisms driving this success remain largely opaq...

Tags: cs.AI · cs.CL · +1

[Paper Review] From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
2026-01-03 · Paper Review

Artificial intelligence (AI) is reshaping scientific discovery, evolving from specialized computational tools into autonomous research partners. We position Agentic Science as a pivotal stage within t...

Tags: cs.LG

[Paper Review] Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
2026-01-03 · Paper Review

We present AlphaGeometry2 (AG2), a significantly improved version of AlphaGeometry introduced in (Trinh et al., 2024), which has now surpassed an average gold medalist in solving Olympiad geometry pro...

Tags: cs.AI · cs.LG · +1

[Paper Review] Nested Learning: The Illusion of Deep Learning Architecture
2026-01-02 · Paper Review

Over the last decades, developing more powerful neural architectures and simultaneously designing optimization algorithms to effectively train them have been the core of research efforts to enhance th...

[Paper Review] Act2Goal: From World Model To General Goal-conditioned Policy
2026-01-02 · Paper Review

Specifying robotic manipulation tasks in a manner that is both expressive and precise remains a central challenge. While visual goals provide a compact and unambiguous task specification, existing goa...

Tags: cs.RO · cs.AI · +1

[Paper Review] Pruning as a Game: Equilibrium-Driven Sparsification of Neural Networks
2026-01-02 · Paper Review

Neural network pruning is widely used to reduce model size and computational cost. Yet, most existing methods treat sparsity as an externally imposed constraint, enforced through heuristic importance ...

Tags: cs.AI

[Paper Review] AgentEvolver: Towards Efficient Self-Evolving Agent System
2026-01-02 · Paper Review

Autonomous agents powered by large language models (LLMs) have the potential to significantly enhance human productivity by reasoning, using tools, and executing complex tasks in diverse environments....

Tags: cs.LG · cs.AI · +1

[Paper Review] TiDAR: Think in Diffusion, Talk in Autoregression
2026-01-02 · Paper Review

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language model...

Tags: cs.CL · cs.AI · +1

[Paper Review] AlphaResearch: Accelerating New Algorithm Discovery with Language Models
2026-01-02 · Paper Review

Large language models have made significant progress in complex but easy-to-verify problems, yet they still struggle with discovering the unknown. In this paper, we present AlphaResearch, an ...

Tags: cs.CL

[Paper Review] Attention and Compression is all you need for Controllably Efficient Language Models
2026-01-02 · Paper Review

The quadratic cost of attention in transformers motivated the development of efficient approaches: namely sparse and sliding window attention, convolutions and linear attention. Although these approac...

Tags: cs.LG

[Paper Review] The Curved Spacetime of Transformer Architectures
2026-01-02 · Paper Review

We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. Queries and keys induce an effective metric on representation s...

Tags: cs.LG · cs.CL · +1

[Paper Review] No-Human in the Loop: Agentic Evaluation at Scale for Recommendation
2026-01-02 · Paper Review

Evaluating large language models (LLMs) as judges is increasingly critical for building scalable and trustworthy evaluation pipelines. We present ScalingEval, a large-scale benchmarking study that sys...

Tags: cs.AI · cs.IR · +1

[Paper Review] ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
2026-01-02 · Paper Review

Multimodal reasoning requires iterative coordination between language and vision, yet it remains unclear what constitutes a meaningful interleaved chain of thought. We posit that text and image though...

Tags: cs.CV

[Paper Review] Context Engineering 2.0: The Context of Context Engineering
2026-01-02 · Paper Review

Karl Marx once wrote that "the human essence is the ensemble of social relations", suggesting that individuals are not isolated entities but are fundamentally shaped by their interactions with other...

Tags: cs.AI · cs.CL · +1

[Paper Review] Chain-of-Thought Hijacking
2026-01-02 · Paper Review

Large reasoning models (LRMs) achieve higher task performance with more inference-time computation, and prior works suggest this scaled reasoning may also strengthen safety by improving refusal. Yet w...

Tags: cs.AI

[Paper Review] GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning
2026-01-02 · Paper Review

Autonomous agents powered by large language models (LLMs) have shown impressive capabilities in tool manipulation for complex task-solving. However, existing paradigms such as ReAct rely on sequential...

Tags: cs.AI · cs.CL · +1

[Paper Review] Reasoning with Sampling: Your Base Model is Smarter Than You Think
2026-01-02 · Paper Review

Frontier reasoning models have exhibited incredible capabilities across a wide array of disciplines, driven by posttraining large language models (LLMs) with reinforcement learning (RL). However, desp...

Tags: cs.LG · cs.AI · +1

[Paper Review] Cache-to-Cache: Direct Semantic Communication Between Large Language Models
2026-01-02 · Paper Review

Multi-LLM systems harness the complementary strengths of diverse Large Language Models, achieving performance and efficiency gains unattainable by a single model. In existing designs, LLMs communicate...

Tags: cs.CL · cs.LG · +1

[Paper Review] Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks
2026-01-02 · Paper Review

Large language models (LLMs) have shown remarkable advancements in enabling language agents to tackle simple tasks. However, applying them for complex, multi-step, long-horizon tasks remains a challen...

Tags: cs.CL

The Importance of Testing and How to Implement It
2026-01-01 · General

There is probably no developer who has gone through software development without hearing the word "testing." Testing plays an essential role in guaranteeing the reliability and stability of software. It is especially important in data science and artificial intelligence, where the correctness of results and the performance of models must be verified. This post looks at why testing matters and how to impl...

[Paper Review] mHC: Manifold-Constrained Hyper-Connections
2026-01-01 · Paper Review

Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifyi...

Tags: cs.CL · cs.AI · +1

[Paper Review] The Era of Agentic Organization: Learning to Organize with Language Models
2026-01-01 · Paper Review

We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize ...

Tags: cs.AI · cs.CL · +1

[Paper Review] Real Deep Research for AI, Robotics and Beyond
2026-01-01 · Paper Review

With the rapid growth of research in AI and robotics, now producing over 10,000 papers annually, it has become increasingly difficult for researchers to stay up to date. Fast evolving trends, the rise o...

Tags: cs.AI · cs.CL · +1

[Paper Review] The Free Transformer
2026-01-01 · Paper Review

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experiment...

Tags: cs.LG

[Paper Review] Training-Free Group Relative Policy Optimization
2026-01-01 · Paper Review

Recent advances in Large Language Model (LLM) agents have demonstrated their promising general capabilities. However, their performance in specialized real-world domains often degrades due to challeng...

Tags: cs.CL

[Paper Review] ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
2026-01-01 · Paper Review

With the growing adoption of large language model agents in persistent real-world roles, they naturally encounter continuous streams of tasks. A key limitation, however, is their failure to learn from...

Tags: cs.AI · cs.CL · +1

[Paper Review] Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
2026-01-01 · Paper Review

While Transformers have been the main architecture behind deep learning's success in language modeling, state-space models (SSMs) such as Mamba have recently been shown to match or outperform Transfor...

Tags: cs.LG

[Paper Review] Mamba: Linear-Time Sequence Modeling with Selective State Spaces
2026-01-01 · Paper Review

Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time a...

Tags: cs.LG · cs.AI · +1

[Paper Review] Vector database management systems: Fundamental concepts, use-cases, and current challenges
2026-01-01 · Paper Review

Vector database management systems have emerged as an important component in modern data management, driven by the growing need to computationally describe rich data such as texts, ...

Tags: cs.DB

Improving Model Performance with Data Augmentation
2026-01-01 · General

In artificial intelligence (AI) and machine learning (ML), data is the most valuable asset. Securing a sufficient amount of high-quality data is a key factor in determining model performance. In practice, however, data is often scarce, or collecting it is costly and time-consuming. Data augmentation techniques have drawn attention as a way to address this prob...

Building a RAG System: Principles and Implementation of Retrieval-Augmented Generation
2025-12-31 · NLP

As artificial intelligence (AI) advances, natural language processing (NLP) is seeing a wave of innovation. One example is the RAG (Retrieval-Augmented Generation) system. RAG combines two processes, retrieval and generation, to provide more accurate and richer information. While traditional NLP models rely on large-scale da...

PyTorch Tensor Basics
2025-12-31 · Deep Learning

Deep learning stands at the center of modern artificial intelligence (AI). Among deep learning frameworks, PyTorch is popular with researchers and developers thanks to its flexibility and intuitive interface. To properly understand PyTorch, it is important to first understand the tensor, its basic unit. Tensors represent data...

[Paper Review] QLoRA: Efficient Finetuning of Quantized LLMs
2025-12-31 · Paper Review

We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLo...

Tags: cs.LG

[Paper Review] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2025-12-31 · Paper Review

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed t...

Tags: cs.CL

Introduction to MLOps: Building a Machine Learning Operations Pipeline
2025-12-31 · MLOps

Machine learning (ML) is driving innovation across many industries today. Successfully developing a machine learning model, however, is not enough on its own. Deploying the model to a production environment, monitoring it, and continuously improving it takes dedicated effort, and this is where MLOps plays a crucial role....

LLM Fine-Tuning: LoRA and QLoRA Techniques
2025-12-31 · NLP

Large language models (LLMs) have driven innovation in natural language processing (NLP). These models leverage vast amounts of data to perform a wide range of linguistic tasks. However, LLMs demand enormous computing resources, and the fine-tuning pro...

Diffusion Models: Principles and Applications of Image Generation AI
2025-12-31 · Deep Learning

Artificial intelligence (AI) has advanced rapidly in recent years, drawing particular attention in image generation. At the center of this progress are diffusion models, a powerful technique that learns complex patterns, excels at generating realistic images, and is being applied across many industries....

Deep Learning Basics
2025-12-31 · Deep Learning

Deep learning is one of the most prominent technologies in artificial intelligence (AI), driving transformative change across many fields. Its role in solving complex problems such as image recognition and natural language processing keeps growing. This article explains the basic concepts of deep learning and how to implement them in practi...

Word Embedding in Natural Language Processing
2025-12-30 · NLP

Word embedding is an essential concept in natural language processing (NLP). It is a key technique that helps computers understand and process human language. This blog post covers the basic concepts and main techniques of word embedding, with hands-on practice in Python co...

Understanding the Transformer Attention Mechanism
2025-12-30 · Deep Learning

Natural language processing (NLP) has undergone transformative change in recent years, with the Transformer model at its center. Transformers deliver outstanding performance across a variety of NLP tasks and are used in language modeling, translation, summarization, question answering, and more. At the heart of the Transformer...

Tags: Transformer · Attention · Deep Learning · +2

The Complete Guide to Reinforcement Learning: From Theory to Practice
2025-12-30 · Machine Learning

Reinforcement learning is a key technology in artificial intelligence (AI) that gives machines the ability to learn and make decisions on their own. It has produced groundbreaking results in robot control, game playing, autonomous driving, and more. This guide covers the history, theoretical background, major algorithms, and practical implementation of reinforcement learning in depth.

Tags: Reinforcement Learning · Q-Learning · DQN · +2

[Paper Review] Attention Is All You Need
2025-12-30 · Paper Review

The landmark paper that first proposed the Transformer architecture. By discarding RNNs and CNNs entirely and presenting a new paradigm for sequence modeling built on the attention mechanism alone, it laid the foundation for modern NLP, including BERT and GPT.

Tags: Transformer · Attention · +2

Graph Neural Networks Basics
2025-12-30 · Deep Learning

Artificial intelligence (AI) and machine learning (ML) are driving innovation across many fields. Among these advances, graph neural networks (GNNs) have drawn attention as a powerful tool for efficiently processing complex structured data. Graph data arises naturally in real-world problems such as social networks, recommender systems, and molecular structur...

CNN Image Classification: A Core Technique of Deep Learning
2025-12-30 · Deep Learning

Image classification is one of the most important problems in computer vision, with applications ranging from obstacle recognition in autonomous vehicles to medical image analysis and image tagging on social media. Convolutional neural networks (CNNs) in particular deliver outstanding performance on image data and stand at the center of the deep learning revolution. In this blo...

Tags: CNN · Deep Learning · Computer Vision · +2

The Complete Guide to BERT: A Revolutionary NLP Model
2025-12-30 · NLP

BERT (Bidirectional Encoder Representations from Transformers) is a groundbreaking natural language processing model released by Google in 2018. This guide covers in detail the architecture, training methods, and practical usage of BERT, which introduced a new paradigm to NLP through bidirectional context understanding.

Tags: BERT · NLP · Transformer · +2

Welcome to the SuanLab Blog
2024-12-27 · General

This is the SuanLab blog, where we share a variety of stories on data science, artificial intelligence, and deep learning.

Tags: Welcome · SuanLab · Blog