
Blog

Stories about data science, artificial intelligence, and deep learning

Posts 1-12 of 202

[Paper Review] LLM2Vec-Gen: Generative Embeddings from Large Language Models
2026-03-15 | 10 | Paper Review

LLM-based text embedders typically encode the semantic content of their input. However, embedding tasks require mapping diverse inputs to similar outputs. Typically, this input-output is addressed by ...

Paper Review
cs.CL
[Paper Review] OpenClaw-RL: Train Any Agent Simply by Talking
2026-03-15 | 9 | Paper Review

Every agent interaction generates a next-state signal, namely the user reply, tool output, terminal or GUI state change that follows each action, yet no existing agentic RL system recovers it as a liv...

Paper Review
cs.CL
[Paper Review] LLM2Vec-Gen: Generative Embeddings from Large Language Models
2026-03-14 | 8 | Paper Review

LLM-based text embedders typically encode the semantic content of their input. However, embedding tasks require mapping diverse inputs to similar outputs. Typically, this input-output is addressed by ...

Paper Review
cs.CL
[Paper Review] Nurture-First Agent Development: Building Domain-Expert AI Agents Through Conversational Knowledge Crystallization
2026-03-12 | 8 | Paper Review

The emergence of large language model (LLM)-based agent frameworks has shifted the primary challenge in building domain-expert AI agents from raw capability to effective encoding of domain expertise. ...

Paper Review
cs.AI
cs.HC
+1
[Paper Review] Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
2026-03-12 | 9 | Paper Review

Language models (LMs) often struggle to generate diverse, human-like creative content, raising concerns about the long-term homogenization of human thought through repeated exposure to similar outputs...

Paper Review
cs.CL
[Paper Review] Cache Mechanism for Agent RAG Systems
2026-03-10 | 8 | Paper Review

Recent advances in Large Language Model (LLM)-based agents have been propelled by Retrieval-Augmented Generation (RAG), which grants the models access to vast external knowledge bases. Despite RAG's s...

Paper Review
cs.CL
[Paper Review] Why Is Anything Conscious?
2026-03-10 | 9 | Paper Review

We tackle the problem of consciousness by taking the naturally selected, embodied organism as our starting point. We provide a formalism describing how biological systems such as human bodies self-org...

Paper Review
cs.AI
[Paper Review] RealWonder: Real-Time Physical Action-Conditioned Video Generation
2026-03-08 | 9 | Paper Review

Current video generation models cannot simulate physical consequences of 3D actions like forces and robotic manipulations, as they lack structural understanding of how actions affect 3D scenes. We pre...

Paper Review
cs.CV
cs.AI
+1
[Paper Review] Helios: Real Real-Time Long Video Generation Model
2026-03-08 | 8 | Paper Review

We introduce Helios, the first 14B video generation model that runs at 19.5 FPS on a single NVIDIA H100 GPU and supports minute-scale generation while matching the quality of a strong baseline. We mak...

Paper Review
cs.CV
[Paper Review] Phi-4-reasoning-vision-15B Technical Report
2026-03-08 | 9 | Paper Review

We present Phi-4-reasoning-vision-15B, a compact open-weight multimodal reasoning model, and share the motivations, design choices, experiments, and learnings that informed its development. Our goal i...

Paper Review
cs.AI
cs.CV
+1
[Paper Review] Beyond Language Modeling: An Exploration of Multimodal Pretraining
2026-03-08 | 17 | Paper Review

The visual world offers a critical axis for advancing foundation models beyond language. Despite growing interest in this direction, the design space for native multimodal models remains opaque. We pr...

Paper Review
cs.CV
[Paper Review] Chain of World: World Model Thinking in Latent Motion
2026-03-08 | 9 | Paper Review

Vision-Language-Action (VLA) models are a promising path toward embodied intelligence, yet they often overlook the predictive and temporal-causal structure underlying visual dynamics. World-model VLAs...

Paper Review
cs.CV
cs.AI
+1
...