Simple Scala interface for HDFS filesystem operations.
Add to your build.sbt:
libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.8.1"
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.8.1"
libraryDependencies += "com.typesafe.scala-logging" %% "scala-logging" % "3.9.4"
Import the library and initialize your filesystem:
import dfs._
import org.apache.hadoop.fs.FileSystem
implicit val fs: FileSystem = yourHadoopClusterInstance.getFileSystem()
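`yourHadoopClusterInstance` above is a placeholder. As a minimal sketch, assuming only the standard Hadoop client API (nothing specific to this library), a `FileSystem` can also be built from a `Configuration`. The `FsSetup` object and the NameNode URI mentioned in the comment are hypothetical names used for illustration.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object FsSetup {
  // Local filesystem: convenient for experiments without a cluster.
  def local(): FileSystem = FileSystem.getLocal(new Configuration())

  // Remote cluster: pass your own NameNode URI, for example
  // "hdfs://namenode:8020" (a hypothetical address).
  def remote(defaultFs: String): FileSystem = {
    val conf = new Configuration()
    conf.set("fs.defaultFS", defaultFs)
    FileSystem.get(conf)
  }
}
```

With this in place, `implicit val fs: FileSystem = FsSetup.local()` would let the operations below run against the local disk, which may be useful when trying the library out.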
Create a file with automatic parent directory creation:
val created = touch("path/to/file.txt")
Create with custom parameters:
val created = touch(
path = "path/to/file.txt",
overwrite = true,
bufferSize = 8192,
replicationFactor = 3,
blockSize = 268435456
)
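The `blockSize` literal in the call above is 256 MiB expressed in bytes. The following sketch shows how such a value can be derived rather than typed by hand, and how many HDFS blocks a file of a given size would occupy. The `HdfsSizes` object and its members are hypothetical helpers, not part of this library.

```scala
object HdfsSizes {
  val MiB: Long = 1024L * 1024L

  // 256 MiB in bytes, the same value as the literal 268435456.
  val blockSize: Long = 256L * MiB

  // Number of blocks a file occupies: ceiling of fileSize / blockSize.
  def blocksNeeded(fileSize: Long, blockSize: Long): Long =
    if (fileSize <= 0L) 0L else (fileSize + blockSize - 1L) / blockSize
}
```

Using `Long` arithmetic here avoids silent `Int` overflow once sizes go past 2 GiB.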
Create a directory and all parent directories:
val created = mkdir("path/to/directory")
Basic move operation:
val moved = mv("source/path", "destination/path")
Move into a directory (creates parents if needed):
val moved = mv.into("source/path", "destination/directory")
Move with overwrite:
val moved = mv.over("source/path", "destination/path")
Copy a single file:
cp(fs, "source/file.txt", "destination/file.txt")
Copy directories recursively:
cp.recursive(fs, "source/directory", "destination/directory")
Remove a file:
val removed = rm("path/to/file.txt")
Remove directories recursively:
val removed = rm.r("path/to/directory")
Check whether a path exists and what kind of entry it is:
val fileExists = exists("path/to/file.txt")
val isDir = isDirectory("path/to/directory")
val isRegularFile = isFile("path/to/file.txt")
Get file size:
val fileSize = size("path/to/file.txt")
Get comprehensive file metadata:
val metadata = stat("path/to/file.txt")
println(s"Size: ${metadata.size} bytes")
println(s"Owner: ${metadata.owner}")
println(s"Permissions: ${metadata.permissions}")
List files in directory:
val files = ls(fs, "path/to/directory")
List with detailed information:
val details = ls.details(fs, "path/to/directory")
println(details)
Read entire file:
val content = cat(fs, "path/to/file.txt")
Read with line numbers:
val content = cat.numbered(fs, "path/to/file.txt")
Read first N lines:
val head = cat.head(fs, "path/to/file.txt", lines = 20)
Read last N lines:
val tail = cat.tail(fs, "path/to/file.txt", lines = 10)
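To make the line-based variants above concrete, here is a small sketch of the same slicing applied to an in-memory string. It illustrates the idea only: the `LineSlices` object is hypothetical, and the tab-separated numbering format is an assumption, not necessarily what `cat.numbered` produces.

```scala
object LineSlices {
  // Split text into lines; an empty string yields no lines.
  def lines(text: String): Vector[String] =
    if (text.isEmpty) Vector.empty else text.split("\n").toVector

  // First n lines.
  def head(text: String, n: Int): Vector[String] = lines(text).take(n)

  // Last n lines.
  def tail(text: String, n: Int): Vector[String] = lines(text).takeRight(n)

  // Prefix each line with its 1-based number (format is an assumption).
  def numbered(text: String): Vector[String] =
    lines(text).zipWithIndex.map { case (line, i) => s"${i + 1}\t$line" }
}
```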
Change file owner:
chown("path/to/file.txt", "newowner")
Change owner and group:
chown("path/to/file.txt", "newowner", "newgroup")
Recursive ownership change:
chown.r("path/to/directory", "newowner", "newgroup")
Set file permissions using Unix-style permissions:
val permissions = Perm("755")
chmod("path/to/file.txt", permissions)
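For readers less familiar with octal notation, the sketch below decodes a three-digit Unix mode such as "755" into its symbolic form. `OctalPerm` is a hypothetical helper written for illustration; it does not describe how `Perm` is implemented.

```scala
object OctalPerm {
  // One octal digit encodes read (4), write (2) and execute (1) bits.
  private def triad(digit: Int): String = {
    val r = if ((digit & 4) != 0) "r" else "-"
    val w = if ((digit & 2) != 0) "w" else "-"
    val x = if ((digit & 1) != 0) "x" else "-"
    r + w + x
  }

  // Digits are ordered owner, group, others.
  def symbolic(octal: String): String = {
    require(
      octal.length == 3 && octal.forall(c => c >= '0' && c <= '7'),
      s"expected three octal digits, got: $octal"
    )
    octal.toList.map(c => triad(c - '0')).mkString
  }
}
```

So "755" gives the owner full access and everyone else read and execute, while "644" would be a typical mode for a non-executable file.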
All operations return Boolean for success/failure or throw exceptions for critical errors. Failed operations are logged with detailed error messages.
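Because an operation can signal a problem either by returning `false` or by throwing, callers may want to fold both cases into one result. The following is one possible sketch using `scala.util.Try`; `SafeOps.attempt` is a hypothetical wrapper, not something this library provides.

```scala
import scala.util.{Failure, Success, Try}

object SafeOps {
  // Runs an operation that returns Boolean or throws, and maps both
  // failure modes to Left with a short message.
  def attempt(description: String)(op: => Boolean): Either[String, Unit] =
    Try(op) match {
      case Success(true)  => Right(())
      case Success(false) => Left(s"$description failed")
      case Failure(e) =>
        Left(s"$description threw ${e.getClass.getSimpleName}: ${e.getMessage}")
    }
}
```

A call might then look like `SafeOps.attempt("rm")(rm("path/to/file.txt"))`, with the caller pattern matching on the resulting `Either`.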
Run tests:
sbt test
Special thanks to @lihaoyi for his Scala libraries, particularly OS-Lib, which heavily inspired this library's design patterns.