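A while back Mem0 inexplicably took off and was given labels such as "beyond RAG", "next-generation RAG", and "giving LLMs powerful personalized memory". As of today it has collected 19k GitHub stars, which shows how much attention it has drawn. PaperAgent also wrote an article analyzing its code (mainly the core prompts):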

"After reading Mem0's source code: it is hot, and I am baffled."

However, once the Mem0 hype fades, will there be a next Mem0?

Here, PaperAgent has collected several open-source memory projects that empower LLMs, agents, or apps: supermemory, redcache-ai, and Memary.

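Of these, Memary claims to track user preferences and emulate human memory so that an agent can learn and improve over time. It stores knowledge in a Neo4j graph database and uses Llama Index to inject knowledge.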

Memary's memory architecture

Agent

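To make Memary usable by developers who do not already have an agent, the project ships a simple agent implementation. It uses a ReAct agent to plan and execute a query with the tools it has been given: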

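  • The search tool is essential for retrieving information from the knowledge graph. It queries the knowledge graph based on existing nodes to produce a response, and performs an external search when no relevant entity exists.
  • Default tool: computer vision powered by LLaVa
  • Default tool: a location tool that uses geocoder and Google Maps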

def external_query(self, query: str):
    messages_dict = [
        {"role": "system", "content": "Be precise and concise."},
        {"role": "user", "content": query},
    ]
    messages = [ChatMessage(**msg) for msg in messages_dict]
    external_response = self.query_llm.chat(messages)
    return str(external_response)



def search(self, query: str) -> str:
    response = self.query_engine.query(query)
    if response.metadata is None:
        return self.external_query(query)
    else:
        return response
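
The fallback behaviour of the search tool can be tried without Neo4j or an LLM. The following is a minimal sketch, not Memary's implementation: `KGResponse`, `toy_kg`, and `toy_external` are hypothetical stand-ins for the query engine and the external LLM call, and the only thing carried over from the snippet above is the rule "if the response has no metadata, ask the external source".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class KGResponse:
    """Hypothetical stand-in for a query-engine response."""
    text: str
    metadata: Optional[dict]  # None means no matching entity was found


def search(query: str,
           kg_query: Callable[[str], KGResponse],
           external_query: Callable[[str], str]) -> str:
    """Answer from the knowledge graph, else fall back to an external search."""
    response = kg_query(query)
    if response.metadata is None:
        return external_query(query)
    return response.text


def toy_kg(query: str) -> KGResponse:
    """Toy knowledge graph that only knows about Neo4j."""
    if "neo4j" in query.lower():
        return KGResponse("Neo4j is a graph database.", {"kg_rel_map": {}})
    return KGResponse("", None)


def toy_external(query: str) -> str:
    """Toy external search that just echoes the query."""
    return f"[external] {query}"


print(search("What is Neo4j?", toy_kg, toy_external))
print(search("What is Mem0?", toy_kg, toy_external))
```

Passing the two back ends in as callables keeps the routing rule testable on its own, independent of which graph store or model sits behind it.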

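Knowledge graph

Knowledge graph ↔ LLM

  • Memary uses a Neo4j graph database to store knowledge.
  • Llama Index is used to add nodes to the graph store based on documents.
  • Perplexity (the mistral-7b-instruct model) is used for external queries.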

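Knowledge graph use cases

  • The final agent response is injected into the existing knowledge graph.
  • Memary searches the knowledge graph with a recursive retrieval approach: it determines the key entities in the query, builds a subgraph around those entities with a maximum depth of 2, and then uses that subgraph to build the context.
  • When a query contains multiple key entities, Memary uses multi-hop reasoning to join the individual subgraphs into one larger subgraph to search.
  • Compared with searching the entire knowledge graph at once, these techniques reduce latency.
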
def query(self, query: str) -> str:
        # get the response from react agent
        response = self.routing_agent.chat(query)
        self.routing_agent.reset()
        # write response to file for KG writeback
        with open("data/external_response.txt", "w") as f:
            print(response, file=f)
        # write back to the KG
        self.write_back()
        return response



def check_KG(self, query: str) -> bool:
        """Check if the query is in the knowledge graph.
        Args:
            query (str): query to check in the knowledge graph
        Returns:
            bool: True if the query is in the knowledge graph, False otherwise
        """
        response = self.query_engine.query(query)
        if response.metadata is None:
            return False
        return generate_string(
            list(list(response.metadata.values())[0]["kg_rel_map"].keys())
        )
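
The retrieval idea described above (a subgraph of depth at most 2 per key entity, then a union of the subgraphs when several entities appear in one query) can be sketched on a plain in-memory graph. This is an illustrative approximation under stated assumptions, not Memary's code: the adjacency map `GRAPH` replaces Neo4j, entity extraction is skipped (the key entities are passed in directly), and `build_subgraph` and `build_context` are hypothetical names.

```python
from collections import deque

# Toy graph as an adjacency map: entity -> list of (relation, neighbour).
GRAPH = {
    "Harry": [("FRIEND_OF", "Ron"), ("ATTENDS", "Hogwarts")],
    "Ron": [("SIBLING_OF", "Ginny")],
    "Ginny": [("PLAYS", "Quidditch")],
    "Hogwarts": [("LOCATED_IN", "Scotland")],
    "Hermione": [("ATTENDS", "Hogwarts"), ("READS", "Books")],
}


def build_subgraph(graph, entity, max_depth=2):
    """Collect every triple reachable from `entity` within `max_depth` hops."""
    triples = set()
    visited = {entity}
    queue = deque([(entity, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for relation, neighbour in graph.get(node, []):
            triples.add((node, relation, neighbour))
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    return triples


def build_context(graph, entities, max_depth=2):
    """Union the per-entity subgraphs into one larger subgraph."""
    merged = set()
    for entity in entities:
        merged |= build_subgraph(graph, entity, max_depth)
    return sorted(merged)


for triple in build_context(GRAPH, ["Harry", "Hermione"]):
    print(triple)
```

Because expansion stops at depth 2, the triple about Ginny playing Quidditch (three hops from Harry) never enters the context, which is where the latency saving over a whole-graph search comes from.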

Memory module

The memory module consists of the Memory Stream and the Entity Knowledge Store. Its design was influenced by K-LaMP, proposed by Microsoft Research.

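Memory stream

The memory stream captures every entity inserted into the knowledge graph together with its timestamp. The stream reflects the breadth of the user's knowledge, i.e. the concepts the user has been exposed to, but says nothing about the depth of that exposure.

  • Timeline analysis: plot a timeline of interactions, highlighting moments of high interaction rate or shifts in topical focus. This helps show how the user's interests change over time.
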
def add_memory(self, entities):
        self.memory.extend([
            MemoryItem(str(entity),
                       datetime.now().replace(microsecond=0))
            for entity in entities
        ])
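
  • Theme extraction: look for recurring themes or topics across interactions. This thematic analysis can help anticipate the user's interests or questions, even before they are stated explicitly.
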
def get_memory(self) -> list[MemoryItem]:
        return self.memory
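
Both analyses above reduce to counting over the stream. The sketch below assumes a simplified `MemoryItem` with only an `entity` and a `date` field (matching how the snippets above use it); `interactions_per_day` and `recurring_entities` are hypothetical helpers written for illustration and are not part of Memary.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MemoryItem:
    """Simplified stand-in for a memory-stream item: an entity plus a timestamp."""
    entity: str
    date: datetime


def interactions_per_day(memory):
    """Count how many entities were added to the stream on each calendar day."""
    return Counter(item.date.date() for item in memory)


def recurring_entities(memory, min_count=2):
    """Return entities seen at least `min_count` times, most frequent first."""
    counts = Counter(item.entity for item in memory)
    return [entity for entity, n in counts.most_common() if n >= min_count]


stream = [
    MemoryItem("Neo4j", datetime(2024, 7, 1, 9, 0)),
    MemoryItem("LlamaIndex", datetime(2024, 7, 1, 9, 5)),
    MemoryItem("Neo4j", datetime(2024, 7, 2, 14, 0)),
    MemoryItem("ReAct", datetime(2024, 7, 2, 14, 30)),
    MemoryItem("Neo4j", datetime(2024, 7, 2, 15, 0)),
]

print(interactions_per_day(stream))
print(recurring_entities(stream))
```

The per-day counts give the raw series for a timeline plot, and the recurring entities are a crude proxy for themes; a real system would likely cluster related entities rather than match exact strings.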
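
Entity knowledge store

The entity knowledge store tracks how often and how recently each entity in the memory stream has been referenced. It reflects the depth of the user's knowledge, i.e. the concepts the user is more familiar with than others.

  • Rank entities by relevance: rank entities using both frequency and recency. An entity that is mentioned often (high count) and was referenced recently is likely to be important, and the user probably knows the concept well.
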
def _select_top_entities(self):
        entity_knowledge_store = self.message.llm_message['knowledge_entity_store']
        entities = [entity.to_dict() for entity in entity_knowledge_store]
        entity_counts = [entity['count'] for entity in entities]
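        # NOTE: np.argsort sorts in ascending order, so this slice selects the
        # TOP_ENTITIES lowest counts; np.argsort(entity_counts)[::-1][:TOP_ENTITIES]
        # would select the highest.
        top_indexes = np.argsort(entity_counts)[:TOP_ENTITIES]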
        return [entities[index] for index in top_indexes]
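
  • Categorize entities: group entities by their nature or the context in which they are mentioned (for example, technical terms or personal interests). This categorization helps quickly retrieve information tailored to the user's query.
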
def _convert_memory_to_knowledge_memory(
            self, memory_stream: list) -> list[KnowledgeMemoryItem]:
        """Converts memory from memory stream to entity knowledge store by grouping entities 
        Returns:
            knowledge_memory (list): list of KnowledgeMemoryItem
        """
        knowledge_memory = []
        entities = set([item.entity for item in memory_stream])
        for entity in entities:
            memory_dates = [
                item.date for item in memory_stream if item.entity == entity
            ]
            knowledge_memory.append(
                KnowledgeMemoryItem(entity, len(memory_dates),
                                    max(memory_dates)))
        return knowledge_memory
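
  • Highlight changes over time: identify any significant change in entity ranking or categorization over time. A shift in the most frequently mentioned entities may indicate a change in the user's interests or knowledge.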
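
Ranking by frequency and recency can be expressed as a single sort. The following is a sketch under assumptions rather than Memary's method: `KnowledgeMemoryItem` is reduced to three fields, `top_entities` is a hypothetical helper, and recency is used only as a tie-breaker when counts are equal (a weighted score would be an equally reasonable choice).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class KnowledgeMemoryItem:
    """Simplified stand-in: an entity, its mention count, and its last mention."""
    entity: str
    count: int
    date: datetime


def top_entities(items, k=3):
    """Rank by count, breaking ties by recency, both in descending order."""
    ranked = sorted(items, key=lambda item: (item.count, item.date), reverse=True)
    return [item.entity for item in ranked[:k]]


store = [
    KnowledgeMemoryItem("Neo4j", 5, datetime(2024, 7, 2)),
    KnowledgeMemoryItem("ReAct", 1, datetime(2024, 7, 3)),
    KnowledgeMemoryItem("LlamaIndex", 5, datetime(2024, 7, 4)),
    KnowledgeMemoryItem("Perplexity", 2, datetime(2024, 6, 30)),
]

print(top_entities(store, k=2))
```

Running the same function on two snapshots of the store and comparing the resulting lists is one simple way to surface the ranking changes mentioned in the last bullet.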

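New context window

Key categorized entities and themes associated with the user are used to tailor agent responses more closely to the user's current interests and preferences and to their level of knowledge and expertise. The new context window consists of the following: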

  • Agent response:

def get_routing_agent_response(self, query, return_entity=False):
        """Get response from the ReAct."""
        response = ""
        if self.debug:
            # writes ReAct agent steps to separate file and modifies format to be readable in .txt file
            with open("data/routing_response.txt", "w") as f:
                orig_stdout = sys.stdout
                sys.stdout = f
                response = str(self.query(query))
                sys.stdout.flush()
                sys.stdout = orig_stdout
            text = ""
            with open("data/routing_response.txt", "r") as f:
                text = f.read()
            plain = ansi_strip(text)
            with open("data/routing_response.txt", "w") as f:
                f.write(plain)
        else:
            response = str(self.query(query))
        if return_entity:
            # the query above already adds final response to KG so entities will be present in the KG
            return response, self.get_entity(self.query_engine.retrieve(query))
        return response

  • Most relevant entities:

def get_entity(self, retrieve) -> list[str]:
        """retrieve is a list of QueryBundle objects.
        A retrieved QueryBundle object has a "node" attribute,
        which has a "metadata" attribute.
        example for "kg_rel_map":
        kg_rel_map = {
            'Harry': [['DREAMED_OF', 'Unknown relation'], ['FELL_HARD_ON', 'Concrete floor']],
            'Potter': [['WORE', 'Round glasses'], ['HAD', 'Dream']]
        }
        Args:
            retrieve (list[NodeWithScore]): list of NodeWithScore objects
        return:
            list[str]: list of string entities
        """
        entities = []
        kg_rel_map = retrieve[0].node.metadata["kg_rel_map"]
        for key, items in kg_rel_map.items():
            # key is the entity of question
            entities.append(key)
            # items is a list of [relationship, entity]
            entities.extend(item[1] for item in items)
            if len(entities) > MAX_ENTITIES_FROM_KG:
                break
        entities = list(set(entities))
        for exceptions in ENTITY_EXCEPTIONS:
            if exceptions in entities:
                entities.remove(exceptions)
        return entities

  • Chat history (summarized to avoid token overflow):

def _summarize_contexts(self, total_tokens: int):
        """Summarize the contexts.
        Args:
            total_tokens (int): total tokens in the response
        """
        messages = self.message.llm_message["messages"]
        # First two messages are system and user personas
        if len(messages) > 2 + NONEVICTION_LENGTH:
            messages = messages[2:-NONEVICTION_LENGTH]
            del self.message.llm_message["messages"][2:-NONEVICTION_LENGTH]
        else:
            messages = messages[2:]
            del self.message.llm_message["messages"][2:]
        message_contents = [message.to_dict()["content"] for message in messages]
        llm_message_chatgpt = {
            "model": self.model,
            "messages": [
                {
                    "role": "user",
                    "content": "Summarize these previous conversations into 50 words:"
                    + str(message_contents),
                }
            ],
        }
        response, _ = self._get_gpt_response(llm_message_chatgpt)
        content = "Summarized past conversation:" + response
        self._add_contexts_to_llm_message("assistant", content, index=2)
        logging.info(f"Contexts summarized successfully. \n summary: {response}")
        logging.info(f"Total tokens after eviction: {total_tokens*EVICTION_RATE}")
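
The eviction step in the snippet above keeps the first two messages (system and user personas) and the most recent `NONEVICTION_LENGTH` messages, and replaces everything in between with one summary. The sketch below isolates that slicing logic so it can be run without an LLM. It is an approximation under assumptions: the value `NONEVICTION_LENGTH = 2` is chosen for the demo, messages are plain dicts, and `fake_summarize` is a hypothetical stub standing in for the model call.

```python
NONEVICTION_LENGTH = 2  # assumed value for this demo: recent messages kept verbatim


def summarize_contexts(messages, summarize, keep_head=2, keep_tail=NONEVICTION_LENGTH):
    """Replace the middle of the chat history with a single summary message.

    messages:  list of {"role": ..., "content": ...} dicts
    summarize: callable turning a list of strings into a short summary string
    """
    if len(messages) > keep_head + keep_tail:
        cut = len(messages) - keep_tail  # keep the most recent messages
    else:
        cut = len(messages)  # short history: everything after the head is evicted
    head, middle, tail = messages[:keep_head], messages[keep_head:cut], messages[cut:]
    if not middle:
        return list(messages)  # nothing to evict
    summary = {
        "role": "assistant",
        "content": "Summarized past conversation:" + summarize([m["content"] for m in middle]),
    }
    return head + [summary] + tail


def fake_summarize(texts):
    """Stub used instead of an LLM call."""
    return f" {len(texts)} earlier messages"


history = [
    {"role": "system", "content": "system persona"},
    {"role": "user", "content": "user persona"},
    {"role": "user", "content": "m1"},
    {"role": "assistant", "content": "m2"},
    {"role": "user", "content": "m3"},
    {"role": "assistant", "content": "m4"},
    {"role": "user", "content": "m5"},
]

for message in summarize_contexts(history, fake_summarize):
    print(message["role"], "|", message["content"])
```

Unlike the original method, this version returns a new list instead of mutating the history in place, which makes the before and after states easy to compare.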

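Finally, thanks to Memary, supermemory, and redcache-ai for open-sourcing their work. We look forward to seeing more open-source LLM application projects (RAG, agents, knowledge graphs, and so on)!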



https://github.com/kingjulio8238/Memary
https://github.com/supermemoryai/supermemory
https://github.com/chisasaw/redcache-ai