[NWRRC2013] Heavy Chain Clusterization

ID: 11095

远端评测题

2000ms

256MiB

尝试: 0

已通过: 0

难度: 6

上传者:

Hydro

标签>

2013

Special Judge

ACM_ICPC

题目描述

A group of biologists is trying to find a cure for a viral disease. They have tried many antibodies of various origins that could potentially fight the viral antigens, and have selected $n$ antibodies that seem to work best during experiments.

Each antibody is identified by its heavy chain -- a sequence of amino acids.

The set of antibodies form a similarity cluster, if at least one of the following holds:

k-prefixes (first $k$ amino acids) of all their heavy chains are equal;

k-suffixes (last $k$ amino acids) of all their heavy chains are equal.

In order to simplify the future research, biologists want to group antibodies to similarity clusters.

You need to split the given antibodies to a minimum number of similarity clusters.

输入格式

The first line contains two integers $n$ and $k$ -- the number of heavy chains and the length of sequence of amino acids to coincide $(1 \le n \le 5 000 , 1 \le k \le 550)$ .

The following $n$ lines contain sequences of amino acids that form heavy chains of antibodies. Each amino acid described with an uppercase English letter. Each heavy chain contains at least $k$ and no more than $550$ amino acids.

输出格式

The first line of output must contain a single integer -- the minimum number of similarity clusters. The following lines must contain descriptions of clusters, one per line.

Each description starts with $m_i$ -- the number of antibodies in the cluster and is followed by $m_i$ integers -- numbers of these antibodies. Antibodies are numbered in the order of appearance in the input starting from one.

Each antibody must be present in exactly one cluster.

题目大意

现有 $n$ 个抗生素，每一个抗生素都可以通过它的一个氨基酸序列被辨认。

如果有一组抗生素满足以下任意一个条件，这些抗生素才可以形成一个“相似群体”：

组内所有抗生素的氨基酸序列的前 $k$ 个氨基酸元素组成的序列全部相同
组内所有抗生素的氨基酸序列的后 $k$ 个氨基酸元素组成的序列全部相同

特别的，一种抗生素可以成为一个相似群体。

为了简化后续研究，生物学家们想要给抗生素分类，使得这 $n$ 个抗生素被分成若干个相似群体。你需要找出一种方案，使得分类后的相似群体数量最少。

第一行包含两个正整数， $n$ 和 $k$ ，含义如题意描述。（ $1≤n≤5000，1≤k≤550$ ）

接下来有 $n$ 行，每行包含一个氨基酸序列。氨基酸序列被描述为由大写英语字母组成的字符串，每个序列包含不少于 $k$ 个，不多于 $550$ 个英文字符。每个英文字符代表一种不同的氨基酸。

第一行包含一个正整数 $ans$ ，表示被分成的相似群体最小可能的数目。

接下来有 $ans$ 行，每行第一个数为 $m_i$ ，表示第 $i$ 个相似群体有 $m_i$ 个元素。接下来有 $m_i$ 个数，第 $j$ 个数表示这个相似群体包含输入中的第 $j$ 个抗生素。（编号从 $1$ 开始。）

每一个抗生素必须恰好出现一次。

4 1
AA
AB
BB
BA

2
2 1 2
2 3 4

3 2
ABA
BAB
XY

提示

Time limit: 2 s, Memory limit: 256 MB.

luogu#P7086. [NWRRC2013] Heavy Chain Clusterization

题目描述

输入格式

输出格式

题目大意

提示

状态

开发

支持

luogu#P7086. [NWRRC2013] Heavy Chain Clusterization

[NWRRC2013] Heavy Chain Clusterization

题目描述

输入格式

输出格式

题目大意

提示

状态

开发

支持

还没有账户？

登录