Wiz Research has identified a publicly accessible ClickHouse database belonging to DeepSeek. The exposure allows full control over database operations, including the ability to access internal data. It includes more than a million lines of log streams containing chat history, secret keys, backend details, and other highly sensitive information. The Wiz Research team immediately and responsibly disclosed the issue to DeepSeek, which promptly secured the exposure.
In this blog post, we detail our discovery and consider its broader implications for the industry at large.
DeepSeek, a Chinese AI startup, has recently drawn significant media attention for its groundbreaking AI models, particularly the DeepSeek-R1 reasoning model. This model rivals leading AI systems such as OpenAI's o1 in performance and stands out for its cost-effectiveness and efficiency.
As DeepSeek made waves in the AI space, the Wiz Research team set out to assess its external security posture and identify any potential vulnerabilities.
Within minutes, we found a publicly accessible ClickHouse database linked to DeepSeek, completely open and unauthenticated, exposing sensitive data. It was hosted at oauth2callback.deepseek.com:9000 and dev.deepseek.com:9000.
This database contained a significant volume of chat history, backend data, and sensitive information, including log streams, API secrets, and operational details.
More critically, the exposure allowed full database control and potential privilege escalation within the DeepSeek environment.
Our reconnaissance began with an assessment of DeepSeek's publicly accessible domains. By mapping the external attack surface with straightforward reconnaissance techniques (passive and active discovery of subdomains), we identified around 30 internet-facing subdomains. Most appeared benign, hosting elements such as the chatbot interface, status page, and API documentation, none of which initially suggested a high-risk exposure.
However, when we expanded the search beyond the standard HTTP ports (80/443), we detected two unusual open ports (8123 and 9000) on DeepSeek hosts.
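The check described above, looking past ports 80 and 443 for other listening services, can be sketched in a few lines of Python. This is only an illustrative sketch for auditing hosts you own or are authorized to test, not the tooling used in this research; the function name `open_ports` and its parameters are assumptions made for the example.

```python
import socket


def open_ports(host, ports, timeout=1.0):
    """Return the ports from `ports` on `host` that accept a TCP connection.

    Hypothetical helper for auditing your own infrastructure: it attempts a
    plain TCP connect to each port and records the ones that succeed.
    """
    found = []
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            continue
    return found
```

For example, `open_ports("db.example.internal", [80, 443, 8123, 9000])` (a placeholder host name) would list which of the standard web ports and the two ports mentioned above are reachable from where the script runs.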
On further investigation, these ports led to a publicly exposed ClickHouse database that could be accessed without any authentication.
ClickHouse is an open-source, columnar database management system designed for fast analytical queries on large datasets. Developed by Yandex, it is widely used for real-time data processing, log storage, and big data analytics, which makes this exposure a very valuable and sensitive discovery.
Using ClickHouse's HTTP interface, we accessed the /play path, which allowed arbitrary SQL queries to be executed directly from the browser. Running a simple SHOW TABLES; query returned a full list of accessible datasets.
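For teams that want to check whether their own ClickHouse deployment answers queries without credentials, the following is a minimal sketch in Python. It assumes the HTTP interface listens on port 8123, as described above, and accepts a SQL statement in the `query` URL parameter; the helper names `clickhouse_query_url` and `accepts_anonymous_queries` are hypothetical, and the probe is limited to a harmless `SELECT 1`.

```python
import http.client
import urllib.parse
import urllib.request


def clickhouse_query_url(host, sql, port=8123, scheme="http"):
    """Build a URL that passes `sql` to the ClickHouse HTTP interface."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urllib.parse.urlencode({"query": sql}, quote_via=urllib.parse.quote)
    return f"{scheme}://{host}:{port}/?{query}"


def accepts_anonymous_queries(host, port=8123, timeout=3.0):
    """Return True if `host` answers `SELECT 1` with no credentials supplied.

    Intended only for servers you own or are authorized to test.
    """
    url = clickhouse_query_url(host, "SELECT 1", port=port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"1"
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed reply.
        return False
```

A `True` result from `accepts_anonymous_queries` against a server of your own suggests that anyone who can reach the port can run queries, which is the condition described in this post.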
Among the tables returned, one stood out: log_stream, which contained extensive logs with highly sensitive data.
The log_stream table contained more than 1 million log entries, with several particularly revealing columns:
timestamp - Logs dating from January 6, 2025
span_name - References to various internal DeepSeek API endpoints
string.values - Plaintext logs, including chat history, API keys, backend details, and operational metadata
_service - Indicates which DeepSeek service generated the logs
_source - Exposes the origin of log requests, containing chat history, API keys, directory structures, and chatbot metadata logs
This level of access posed a critical risk to DeepSeek's own security and to its end users. An attacker could not only retrieve sensitive logs and actual plaintext chat messages, but could also potentially exfiltrate plaintext passwords and local files directly from the server with suitable queries, depending on the ClickHouse configuration.
(Note: To preserve ethical research practices, we did not execute intrusive queries beyond enumeration.)
The rapid adoption of AI services without corresponding security is inherently risky. This exposure underscores the fact that the immediate security risks for AI applications stem from the infrastructure and tools that support them.
While much of the attention around AI security is focused on futuristic threats, the real dangers often come from basic risks, such as the accidental external exposure of databases. These risks, which are fundamental to security, should remain a top priority for security teams.
As organizations rush to adopt AI tools and services from a growing number of startups and providers, it is essential to remember that in doing so we are entrusting these companies with sensitive data. The rapid pace of adoption often leads to security being overlooked, but protecting customer data must remain the top priority. Security teams need to work closely with AI engineers to ensure visibility into the architecture, tooling, and models in use, so that data can be safeguarded and exposure prevented.
The world has never seen a technology adopted at the pace of AI. Many AI companies have rapidly grown into critical infrastructure providers without the security frameworks that typically accompany such widespread adoption. As AI becomes deeply integrated into businesses worldwide, the industry must recognize the risks of handling sensitive data and enforce security practices on par with those required of public cloud providers and major infrastructure providers.