Deleting Columns and Rows from a Kinship Matrix in R Using dimnames and Subset Methods

Deleting Columns and Rows from a Matrix by Name (R)

As data analysts and scientists, we frequently encounter matrices and datasets that require manipulation. In this article, we’ll explore how to delete columns and rows from a matrix based on specific names in R.

Introduction

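A kinship matrix is used in genetics and genomics to represent the genetic relationships between individuals. It is typically a symmetric n x n matrix, where n is the number of individuals and each entry measures how closely a pair of individuals is related. Real kinship matrices usually hold kinship coefficients; the dummy data in this article uses 1s (related, e.g., parent-offspring) and 0s (unrelated) to keep the examples simple.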

In this article, we’ll focus on using R to delete columns and rows from a kinship matrix based on specific names. We’ll cover dimnames and row and column subsetting, with examples on dummy data.

Understanding the Problem

Suppose we have a kinship matrix, kinstmp, that represents relationships between individuals, and some of those individuals need to be dropped from the analysis. We want to delete the rows and columns associated with specific IDs (e.g., “ID3”, “ID5”, “ID6”).

Here’s an example of what our matrix might look like:

# Create a sample kinship matrix
kinstmp <- matrix(c(0, 0, 1, 0, 1, 0,
                   0, 0, 1, 0, 0, 0,
                   0, 1, 1, 1, 0, 0,
                   1, 0, 1, 0, 0, 0,
                   0, 0, 1, 0, 1, 0,
                   0, 0, 1, 0, 0, 0),
                  nrow = 6)
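
One detail worth keeping in mind when reading the call above: matrix() fills its values column by column unless byrow = TRUE is given, so each line of six values typed in the source becomes a column, not a row. A minimal sketch on a small made-up vector (m_col and m_row are illustrative names, not part of the article's data):

```r
# matrix() fills column by column by default
m_col <- matrix(1:6, nrow = 2)
# byrow = TRUE fills row by row instead
m_row <- matrix(1:6, nrow = 2, byrow = TRUE)

m_col[1, ]  # 1 3 5
m_row[1, ]  # 1 2 3
```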

Adding Row and Column Names

To subset rows and columns by name, the matrix needs both row names and column names. We can set them with the dimnames argument of the matrix function, which takes a list of the form list(row names, column names).

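# Add row and column names to the matrix
nm <- paste0("ID", 1:6) # One name per individual, used for both rows and columns
kinstmp <- matrix(c(0, 0, 1, 0, 1, 0,
                    0, 0, 1, 0, 0, 0,
                    0, 1, 1, 1, 0, 0,
                    1, 0, 1, 0, 0, 0,
                    0, 0, 1, 0, 1, 0,
                    0, 0, 1, 0, 0, 0),
                  nrow = 6, dimnames = list(nm, nm))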
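
If the matrix already exists, the names can also be attached afterwards by assigning to dimnames(). A minimal sketch, using a small made-up matrix m and the illustrative name vector ids rather than the article's data:

```r
# Assumption: a 3 x 3 dummy matrix of zeros, used only for illustration
m <- matrix(0, nrow = 3, ncol = 3)
ids <- paste0("ID", 1:3)

# Assign row names and column names in one step
dimnames(m) <- list(ids, ids)

rownames(m)      # "ID1" "ID2" "ID3"
colnames(m)      # "ID1" "ID2" "ID3"
m["ID2", "ID3"]  # elements can now be addressed by name
```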

Subsetting Rows

To delete rows associated with specific IDs, we can use the %in% operator to build a logical vector that flags the row names to exclude, then negate it with ! so that only the remaining rows are kept.

# Define the IDs to exclude from rows
id <- c("ID3", "ID5", "ID6")

# Subset rows
kinstmp2 <- kinstmp[!rownames(kinstmp) %in% id, ]
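
One caveat with this pattern: if only a single row survives, R drops the result to a plain vector unless drop = FALSE is passed. A minimal sketch on a made-up 3 x 3 matrix (m, one_row, and kept are illustrative names):

```r
nm <- paste0("ID", 1:3)
m <- matrix(1:9, nrow = 3, dimnames = list(nm, nm))
id <- c("ID2", "ID3")

# Without drop = FALSE the single remaining row collapses to a vector
one_row <- m[!rownames(m) %in% id, ]
# With drop = FALSE the result stays a 1 x 3 matrix with its dimnames
kept <- m[!rownames(m) %in% id, , drop = FALSE]

is.matrix(one_row)  # FALSE
dim(kept)           # 1 3
```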

Subsetting Columns

To delete the columns associated with the same IDs, we apply the same pattern to the column names: flag the names to exclude with %in%, then negate the result with !.

# Subset columns
kins <- kinstmp2[, !colnames(kinstmp2) %in% id]
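
An equivalent way to express the same selection is to compute the names to keep with setdiff() and index by name. Negative indexing such as -c("ID2") is not available for character names, so this is a common alternative. A minimal sketch on a made-up 4 x 4 matrix (m, keep, by_name, and by_mask are illustrative names):

```r
nm <- paste0("ID", 1:4)
m <- matrix(1:16, nrow = 4, dimnames = list(nm, nm))
id <- c("ID2", "ID4")

# Names to keep, in their original order
keep <- setdiff(colnames(m), id)

by_name <- m[, keep, drop = FALSE]                  # index by name
by_mask <- m[, !colnames(m) %in% id, drop = FALSE]  # index by logical mask

identical(by_name, by_mask)  # TRUE
```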

Subsetting Rows and Columns in One Call

We can also subset rows and columns in a single call by supplying the row index and the column index together.

# Subset rows and columns in one call
kins <- kinstmp[!rownames(kinstmp) %in% id, !colnames(kinstmp) %in% id]
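
If this operation is needed repeatedly, it can be wrapped in a small function. The sketch below is one possible approach, not a base R facility: drop_ids is a hypothetical helper name, and the all-zero matrix kin is placeholder data. It also warns about IDs that do not occur in the matrix, which helps catch typos.

```r
# Hypothetical helper: remove the same IDs from rows and columns
drop_ids <- function(mat, ids) {
  stopifnot(is.matrix(mat), !is.null(rownames(mat)), !is.null(colnames(mat)))
  # Warn about requested IDs that are not present at all
  missing_ids <- setdiff(ids, union(rownames(mat), colnames(mat)))
  if (length(missing_ids) > 0) {
    warning("IDs not found in matrix: ", paste(missing_ids, collapse = ", "))
  }
  # drop = FALSE keeps the result a matrix even if one row or column remains
  mat[!rownames(mat) %in% ids, !colnames(mat) %in% ids, drop = FALSE]
}

# Placeholder 6 x 6 matrix of zeros with ID1..ID6 as row and column names
nm <- paste0("ID", 1:6)
kin <- matrix(0, nrow = 6, ncol = 6, dimnames = list(nm, nm))

kins <- drop_ids(kin, c("ID3", "ID5", "ID6"))
dim(kins)       # 3 3
rownames(kins)  # "ID1" "ID2" "ID4"
```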

Conclusion

In this article, we explored how to delete columns and rows from a matrix based on specific names using R. By adding row and column names, subsetting rows and columns with %in%, and combining the two operations in one call, we can efficiently manipulate kinship matrices and datasets in R.

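Remember to consider the implications of deleting rows and columns when working with matrices, as these operations change the structure of your data. For a kinship matrix in particular, remove the same IDs from both rows and columns so that the result stays square and each row still lines up with its matching column.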
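
A quick structural check after deletion can catch mistakes early. The sketch below is one possible set of checks, run on a small symmetric matrix k whose values are made up for illustration and are not real kinship coefficients:

```r
nm <- paste0("ID", 1:4)
k <- diag(0.5, 4)            # made-up diagonal values
k[1, 2] <- k[2, 1] <- 0.25   # one made-up related pair
dimnames(k) <- list(nm, nm)

id <- c("ID3")
sub <- k[!rownames(k) %in% id, !colnames(k) %in% id, drop = FALSE]

# The result should still be square, identically labelled, and symmetric
stopifnot(
  nrow(sub) == ncol(sub),
  identical(rownames(sub), colnames(sub)),
  isSymmetric(unname(sub))
)
```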


Last modified on 2025-05-01