2.1 Input / Output Streams
These I/O streams are unrelated to the streams in java.util.stream.
2.1.1-2.1.3 Reading and Writing Bytes
1) The easiest way is to use the static methods of the java.nio.file.Files class:
Path path = Path.of(filenameString); // preferred over Paths.get(), which simply calls Path.of()
InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);
2) Get an input stream from any URL:
URL url = new URL("http://horstmann.com/index.html");
InputStream in = url.openStream();
3) Get an input stream from a byte[] array, or write to a byte[] array:
// Get an input stream from a byte[] array
byte[] bytes = ...;
InputStream in = new ByteArrayInputStream(bytes);
// Conversely, you can write to a ByteArrayOutputStream and then collect the bytes:
ByteArrayOutputStream out = new ByteArrayOutputStream();
// ... write to out ...
byte[] result = out.toByteArray();
4) The read method returns a single byte (as an int) or -1 at the end of input:
InputStream in = ...;
int b = in.read();
if (b != -1) { byte value = (byte) b; ... }
It is more common to read bytes in bulk:
byte[] bytes = ...;
int len = in.read(bytes); // number of bytes actually read, or -1 at the end of input
5) Before Java 9, there was no method for reading all bytes from a stream. Here is one solution:
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] bytes = new byte[1024];
int len;
while ((len = in.read(bytes)) != -1) { // -1 signals the end of the input stream
    out.write(bytes, 0, len); // write only the len bytes that were just read
}
bytes = out.toByteArray();
For files, just call:
byte[] bytes = Files.readAllBytes(path); // available since Java 7
6) You can write one byte or bytes from an array:
OutputStream out = ...;
int b = ...;
out.write(b); // one byte
byte[] bytes = ...;
out.write(bytes); // all bytes from an array
out.write(bytes, start, length); // a range of the array
7) When writing to a stream, close it when you are done:
out.close();
Or better, use a try-with-resources block (resource will be automatically closed):
try (OutputStream out = ...) {
    out.write(bytes);
}
8) To save an input stream to a file, call:
Files.copy(in, path, StandardCopyOption.REPLACE_EXISTING);
New in Java 9/10:
- 1. There is finally a method to read all bytes from an input stream (this lifts the limitation noted in 5) above): byte[] bytes = url.openStream().readAllBytes();
- There is also readNBytes.
- 2. InputStream.transferTo(OutputStream) transfers all bytes from an input stream to an output stream.
- 3. Java 10: Reader.transferTo(Writer)
- 4. Java 10: Character sets in PrintWriter, Scanner, etc. can be specified as a Charset instead of a String: new Scanner(path, StandardCharsets.UTF_8)
- 5. Scanner.tokens gets a stream of tokens, similar to Pattern.splitAsStream from Java 8: Stream<String> tokens = new Scanner(path).useDelimiter("\\s*,\\s*").tokens();
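A minimal sketch of readAllBytes and transferTo, using in-memory streams so that it runs without a file or network. The class name StreamCopyDemo and the sample text are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamCopyDemo {
    /** Reads everything from the stream in one call (Java 9+). */
    public static byte[] readAll(InputStream in) throws IOException {
        return in.readAllBytes();
    }

    /** Copies all remaining bytes from in to out and returns the byte count (Java 9+). */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        return in.transferTo(out);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, streams".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(data)).length); // 14
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), sink));     // 14
    }
}
```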
2.1.4 Reading and Writing Text Files
1. Summary:
- InputStream/OutputStream process bytes.
- Text files contain characters.
- Java uses Unicode for characters.
- Readers/Writers convert between bytes and characters.
- Always specify the character encoding. Use StandardCharsets.UTF_8 for Charset parameters and "UTF-8" for String parameters.
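To see why the encoding matters, here is a small sketch that encodes a string as UTF-8 and decodes the bytes with two different charsets. The class name EncodingDemo and the sample word are illustrative assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    /** Encodes text as UTF-8 bytes, then decodes those bytes with the given charset. */
    public static String roundTrip(String text, Charset decodeWith) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String text = "Jos\u00e9"; // the e with acute accent takes two bytes in UTF-8
        // Same charset on both sides: the text survives unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));
        // Wrong charset when decoding: each UTF-8 byte becomes its own character.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1).length());
    }
}
```

Decoding with the wrong charset does not fail. It silently produces different characters, which is why the charset should always be stated explicitly.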
2. You can obtain a Reader for any input stream:
InputStream inStream = ...;
Reader in = new InputStreamReader(inStream, charset);
The read method reads a single char value, which is too low-level for most purposes.
1) You can read a short file into a string:
String content = new String(Files.readAllBytes(path), charset); // Files.readAllBytes(path) returns a byte[], which the String constructor decodes
2) You can get all lines as a list or stream:
List<String> lines = Files.readAllLines(path, charset);

try (Stream<String> lineStream = Files.lines(path, charset)) {
    ...
}
3. Use a Scanner to split input into numbers, words, and so on:
Scanner in = new Scanner(path, "UTF-8");
while (in.hasNextDouble()) {
    double value = in.nextDouble();
    ...
}
// To read words, set the delimiter to any sequence of non-letters (sample in textFile\ScannerTest.java):
// Method 1: in.useDelimiter
in.useDelimiter("\\PL+");
while (in.hasNext()) {
    String word = in.next();
    ...
}
// Method 2: in.tokens() (Java 9+), which uses the delimiter set above
Stream<String> words = in.tokens();
4. To write to a file, construct a PrintWriter in one of the following ways. Then call out.print, out.println, or out.printf to produce output.
PrintWriter out = new PrintWriter(Files.newBufferedWriter(path, charset));

PrintWriter out = new PrintWriter(filenameString, charsetString);
// write data to the file
out.println(data);
Remember to close the file: try (PrintWriter out = ...) {...}
If you already have the entire output in a string, or a collection of lines, call:
Files.write(path, contentString.getBytes(charset));
Files.write(path, lines, charset);
You can also append output to a file:
Files.write(path, lines, charset, StandardOpenOption.APPEND);
5. Sometimes, a library method wants a Writer object. Example:
Throwable.printStackTrace(PrintWriter out)
If you want to capture the output in a string, not a file, use a StringWriter:
StringWriter writer = new StringWriter(); // collects characters in a string instead of a disk file; it has no print methods of its own, so wrap it in a PrintWriter
throwable.printStackTrace(new PrintWriter(writer));
Now you can process the stack trace as a string:
String stackTrace = writer.toString();
The following methods are suitable for text files of moderate length:
- Files.readAllBytes(), Files.readString(), Files.readAllLines(), Files.writeString(), Files.write()
The following methods are suitable for large files or binary files:
InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);
Reader in = Files.newBufferedReader(path, charset); // returns a BufferedReader, which extends Reader
Writer out = Files.newBufferedWriter(path, charset);
2.2/2.5 Reading and Writing Binary Data
1. Working with binary files
The DataInput / DataOutput interfaces have methods readInt / writeInt, readDouble / writeDouble, and so on.
You can wrap any stream into a DataInputStream / DataOutputStream:
DataInput in = new DataInputStream(Files.newInputStream(path)); // or new FileInputStream(filenameString)
DataOutput out = new DataOutputStream(Files.newOutputStream(path)); // or new FileOutputStream(filenameString)
Reading / writing stream data is sequential.
2. Random-access files
2.1 Option 1: RandomAccessFile (section 2.2.2)
"Random access file": You can jump to any file position and start reading/writing. Open with "r" for reading only or "rw" for reading and writing:
RandomAccessFile file = new RandomAccessFile(filenameString, "rw");
getFilePointer yields the current position; seek moves to a given position.
Example: Increment an integer that you just read:
int value = file.readInt();
file.seek(file.getFilePointer() - 4); // readInt advanced the position by 4 bytes (the size of an int), so moving back 4 returns to the start of that integer
file.writeInt(value + 1);
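The increment idiom above can be tried end to end with a temporary file. This is only a sketch; the class name IncrementDemo, the helper incrementAt, and the sample values are assumptions made for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class IncrementDemo {
    /** Increments the int stored at the given byte offset and returns the new value. */
    public static int incrementAt(Path path, long offset) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            file.seek(offset);
            int value = file.readInt();
            // readInt moved the file pointer forward by 4 bytes; step back to overwrite.
            file.seek(file.getFilePointer() - Integer.BYTES);
            file.writeInt(value + 1);
            return value + 1;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("counter", ".dat");
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            file.writeInt(10); // bytes 0..3
            file.writeInt(41); // bytes 4..7
        }
        System.out.println(incrementAt(path, 4)); // 42
        Files.delete(path);
    }
}
```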
2.2 Option 2: Memory-Mapped Files (section 2.5)
A memory-mapped file provides very efficient random access for large files. (It uses the operating system's virtual memory mechanism.)
// Step 1: Get a channel for the file:
FileChannel channel = FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE);
// Step 2: Map an area of the file (or all of it) into memory:
ByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
// Step 3: Use the methods get, getInt, getDouble, and so on to read, and the equivalent put methods to write:
int position = ...;
int value = buffer.getInt(position);
buffer.putInt(position, value + 1);
The file is updated at some point, and certainly when the channel is closed (you can use try-with-resources).
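The three steps can be combined into a small runnable sketch. The class name MappedIncrementDemo and the helper incrementAt are hypothetical; force() is called so that the change is written out before the file is read back.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedIncrementDemo {
    /** Increments the int at the given byte position through a memory mapping. */
    public static int incrementAt(Path path, int position) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            int value = buffer.getInt(position);
            buffer.putInt(position, value + 1); // absolute put, does not move the buffer position
            buffer.force();                     // ask the OS to write the change to the file
            return value + 1;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("mapped", ".dat");
        path.toFile().deleteOnExit();
        Files.write(path, ByteBuffer.allocate(8).putInt(7).putInt(99).array());
        System.out.println(incrementAt(path, 4)); // 100
    }
}
```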
2.4 Working with Files (creating, accessing, and deleting files and directories): Path, Files
1. Working with Path
Path objects specify abstract path names (which may not currently exist on disk). A path is a sequence of directory names, optionally followed by a file name. The first component may be a root component such as / or C:\.
Paths.get / Path.of
Path absolute = Paths.get("/", "home", "cay"); // starts with a root component
Path relative = Paths.get("myapp", "conf", "user.properties");
The path separator / or \ is supplied for the default file system. If you know which platform your program is running on, you can provide a string with separators:
Path homeDirectory = Paths.get("/home/cay");
p.resolve(q) computes "p then q". If q is absolute, that's just q; otherwise, first follow p, then follow q:
Path workPath = homeDirectory.resolve("myapp/work");
p.relativize(q) yields "how to get from p to q":
Paths.get("/home/cay").relativize(Paths.get("/home/fred/myapp"))
// yields "../fred/myapp"
normalize removes redundant . and .. components.
toAbsolutePath yields the absolute path, resolving a relative path against the current working directory.
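A short sketch tying resolve, relativize, normalize, and toAbsolutePath together. The class name PathDemo and the helper hop are made up for illustration; relative paths are used so that the results do not depend on the platform's root syntax.

```java
import java.nio.file.Path;

public class PathDemo {
    /** Goes from 'from' to 'to' via relativize + resolve, then cleans up the ".." parts. */
    public static Path hop(Path from, Path to) {
        Path detour = from.resolve(from.relativize(to)); // e.g. home/cay/../fred/myapp
        return detour.normalize();                       // e.g. home/fred/myapp
    }

    public static void main(String[] args) {
        Path cay = Path.of("home", "cay");
        Path fred = Path.of("home", "fred", "myapp");
        System.out.println(cay.relativize(fred));    // ../fred/myapp on Unix-like systems
        System.out.println(hop(cay, fred).equals(fred)); // true
        System.out.println(cay.toAbsolutePath().isAbsolute()); // true
    }
}
```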
2. Taking Paths Apart
Utility methods to get at the most important parts:
Path p = Paths.get("/home", "cay", "myapp.properties");
Path parent = p.getParent(); // The path /home/cay
Path file = p.getFileName(); // The last element, myapp.properties
Path root = p.getRoot(); // The initial segment / (null for a relative path)
Path first = p.getName(0); // The first element, home
Path dir = p.subpath(1, p.getNameCount()); // All but the first element, cay/myapp.properties
You can iterate over the components:
for (Path component : path) {
    ...
}
To interoperate with the legacy File class, use:
File file = path.toFile();
Path path = file.toPath();
3. Files
2.4.3 To create a new directory, call:
Files.createDirectory(path); // All but the last component must exist; only the final directory is created
Files.createDirectories(path); // Missing intermediate components are created too, so several levels can be created at once
You can create an empty file. If the file already exists, an exception occurs. The check and the creation are atomic.
Files.createFile(path);
Convenience methods for creating temporary files and directories:
Path tempFile = Files.createTempFile(dir, prefix, suffix);
Path tempFile = Files.createTempFile(prefix, suffix);

Path tempDir = Files.createTempDirectory(dir, prefix);
Path tempDir = Files.createTempDirectory(prefix);
Example: Files.createTempFile(null, ".txt")
Files.exists(path)
Files.isDirectory(path), Files.isRegularFile(path), Files.isSymbolicLink(path) tell you whether the path is a directory, a regular file, or a symbolic link. More information: isHidden, isExecutable, isReadable, isWritable
Files.size(path)
2.4.4 To copy or move a file, use the copy or move method:
Files.copy(fromPath, toPath);
Files.move(fromPath, toPath);
You can define the behavior with copy options:
Files.copy(fromPath, toPath, StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.COPY_ATTRIBUTES);
Files.move(fromPath, toPath, StandardCopyOption.ATOMIC_MOVE);
Delete a file like this:
Files.delete(path); // throws an exception if path doesn't exist
boolean deleted = Files.deleteIfExists(path);
2.4.6 Files.list yields a Stream<Path> of the directory entries. The directory is read lazily, which is efficient for huge directories. Be sure to close the stream. (Files.list does not descend into subdirectories.)
try (Stream<Path> entries = Files.list(pathToDirectory)) {...}
Files.walk(dirpath) also visits all descendants:
try (Stream<Path> entries = Files.walk(pathToRoot)) {
    entries.forEach(System.out::println);
}
Files.find filters the visited paths by their attributes:
Files.find(path, maxDepth, (p, attr) -> attr.size() > 10000)
Example: copy a directory tree with Files.walk:
Files.walk(source).forEach(p -> {
    try {
        Path q = target.resolve(source.relativize(p));
        if (Files.isDirectory(p)) Files.createDirectory(q);
        else Files.copy(p, q);
    } catch (IOException ex) {
        throw new UncheckedIOException(ex);
    }
});
You need to visit the children before deleting the parent. => Use a FileVisitor
// Delete the directory tree starting at root
Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
    }
    public FileVisitResult postVisitDirectory(Path dir, IOException ex) throws IOException {
        if (ex != null) throw ex;
        Files.delete(dir); // the directory is empty by now
        return FileVisitResult.CONTINUE;
    }
});
Paths class looks up paths in the default file system.
4. ZIP file system
You can have a file system for the files in a ZIP archive:
FileSystem zipfs = FileSystems.newFileSystem(Paths.get(zipname), (ClassLoader) null);
Copy out a file if you know its name:
Files.copy(zipfs.getPath(sourceName), targetPath);
To list all files in an archive, walk the file tree:
Files.walk(zipfs.getPath("/")).forEach(p -> { /* process p */ });
Here is the magic incantation for creating a ZIP file:
Path zipPath = Paths.get("myfile.zip");
URI uri = new URI("jar", zipPath.toUri().toString(), null); // e.g. jar:file:///C:/Users/xxxxxx/IdeaProjects/trunk/lessonlearn_coreJava/1.zip
// Constructs a URI of the form jar:file:///.../myfile.zip
try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
    // To add files, copy them into the ZIP file system
    Files.copy(sourcePath, zipfs.getPath("/").resolve(targetPath));
}
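Creating and then listing an archive can be sketched as follows. The class name ZipFsDemo, both helper methods, and the entry name hello.txt are assumptions for this example. Note that the target ZIP file must not already exist as an empty file, so the sketch places it inside a fresh temporary directory.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ZipFsDemo {
    /** Creates a new ZIP archive at zipPath containing a single text entry. */
    public static void createZip(Path zipPath, String entryName, String content) throws IOException {
        URI uri = URI.create("jar:" + zipPath.toUri());
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            Files.write(zipfs.getPath("/", entryName), content.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Lists the regular files in the archive, e.g. "/hello.txt". */
    public static List<String> listEntries(Path zipPath) throws IOException {
        try (FileSystem zipfs = FileSystems.newFileSystem(zipPath, (ClassLoader) null);
             Stream<Path> entries = Files.walk(zipfs.getPath("/"))) {
            return entries.filter(Files::isRegularFile)
                          .map(Path::toString)
                          .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path zipPath = Files.createTempDirectory("zipdemo").resolve("demo.zip");
        createZip(zipPath, "hello.txt", "Hello, ZIP!");
        System.out.println(listEntries(zipPath)); // [/hello.txt]
    }
}
```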
5. New in Java 11
- String.lines yields a stream of all lines in a string;
- String.strip trims Unicode whitespace;
- Path.of does the same as Paths.get -- more consistent and shorter;
- Files.readString reads a file into a string;
- OutputStream.nullOutputStream() provides a null stream;
- There are analogous methods for InputStream, Reader, Writer;
2.x Working with Data from the Internet
You can read data from a given URL. That gets you the contents of the URL (from the GET request).
URL url = new URL("http://horstmann.com/index.html");
InputStream in = url.openStream();
Sometimes, you need to use the URLConnection class for more complex cases:
- Making a POST request
- Setting request headers
- Reading response headers
// 1. Get a URLConnection object:
URLConnection connection = url.openConnection();
// 2. Set request properties:
connection.setRequestProperty("Accept-Charset", "UTF-8, ISO-8859-1");
// 3. Send data to the server:
connection.setDoOutput(true);
try (OutputStream out = connection.getOutputStream()) { /* write to out */ }
// 4. Read the response headers:
connection.connect(); // If you skipped step 3
Map<String, List<String>> headers = connection.getHeaderFields();
// 5. Read the response:
try (InputStream in = connection.getInputStream()) { /* read from in */ }
When you post form data, the content type is application/x-www-form-urlencoded. But you still need to encode the name/value pairs.
Suppose the POST data are given in a map:
URLConnection connection = url.openConnection();
connection.setDoOutput(true);
try (Writer out = new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.UTF_8)) {
    boolean first = true;
    for (Map.Entry<String, String> entry : postData.entrySet()) {
        if (first) first = false;
        else out.write("&");
        out.write(URLEncoder.encode(entry.getKey(), "UTF-8"));
        out.write("=");
        out.write(URLEncoder.encode(entry.getValue(), "UTF-8"));
    }
}
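The encoding step can be checked without a network connection by building the form body as a string. This is a sketch; the class name FormEncoder and the sample data are hypothetical, and it uses the URLEncoder.encode(String, Charset) overload from Java 10.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class FormEncoder {
    /** Builds an application/x-www-form-urlencoded body: name=value pairs joined by '&'. */
    public static String encode(Map<String, String> postData) {
        return postData.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>(); // keeps insertion order
        data.put("name", "Cay Horstmann");
        data.put("q", "a&b=c");
        System.out.println(encode(data)); // name=Cay+Horstmann&q=a%26b%3Dc
    }
}
```

The separators & and = inside a value are percent-encoded, so they cannot be confused with the separators between pairs.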
HttpClient (incubating in Java 9, standard since Java 11):
// Build a client:
HttpClient client = HttpClient.newBuilder()
    .followRedirects(HttpClient.Redirect.ALWAYS)
    .build();
// Build a request:
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("http://horstmann.com"))
    .GET()
    .build();
// Get and handle the response:
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
// Asynchronous processing:
client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
    .thenApply(HttpResponse::body) // work with the body string from here on
    .completeOnTimeout("<html></html>", 10, TimeUnit.SECONDS)
    .thenAccept(body -> { /* process body */ });
2.7 Regular Expressions
2.7.1 Basic Syntax
Regular expressions (regex) specify string patterns.
- The regex [Jj]e?a.+ matches Java and jealous, but not Jim or ja
- Special characters . * + ? { | ( ) [ \ ^ $
- . matches any character, * is 0 or more, + 1 or more, ? 0 or 1 repetition
- Use braces for other multiplicities such as {2,4}
- | denotes alternatives: (Java|Scala)
- () are used for grouping
- [...] delimits a character class such as [A-Za-z]
- Useful predefined character classes such as \s (space), \pL (Unicode letters), and their complements (which match the opposite) \S, \PL
- ^ and $ match the beginning and end of input
- Escape special characters with \ to match them literally
- Caution: Must double-escape \ in Java strings
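A few of these rules can be checked directly. This is a small sketch with a made-up class name, RegexBasics. Note how the regex \pL+ is written as "\\pL+" in Java source.

```java
import java.util.regex.Pattern;

public class RegexBasics {
    /** True if the entire string matches [Jj]e?a.+ */
    public static boolean matchesJa(String s) {
        return Pattern.matches("[Jj]e?a.+", s);
    }

    /** True if the string consists of one or more Unicode letters (regex \pL+). */
    public static boolean isLetters(String s) {
        return Pattern.matches("\\pL+", s); // the backslash is doubled inside a Java string
    }

    public static void main(String[] args) {
        System.out.println(matchesJa("Java"));    // true
        System.out.println(matchesJa("jealous")); // true
        System.out.println(matchesJa("Jim"));     // false: no 'a' after the optional 'e'
        System.out.println(matchesJa("ja"));      // false: .+ needs at least one more character
        System.out.println(isLetters("na\u00efve")); // true: the i with diaeresis is a letter
        System.out.println(isLetters("a1"));      // false: 1 is not a letter
    }
}
```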
Two principal ways to use a regex:
- Use 1: Find all matches within a string;
- Use 2: Find whether the entire string matches
Use 1: This loop iterates over all matches of a regex in a string:
Pattern pattern = Pattern.compile(regexString);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    String match = matcher.group();
    ...
}
matcher.start() and matcher.end() yield the start index and the past-the-end index of the current match.
Use 2: Use the matches method to check whether a string matches a regex:
String regex = "[12]?[0-9]:[0-5][0-9][ap]m";
if (Pattern.matches(regex, input)) { ... }
Compile the regex if you need it repeatedly:
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) ...
You can turn the pattern into a predicate:
Stream<String> result = streamOfStrings.filter(pattern.asPredicate());
Use groups to match subexpressions. Group index values start with 1.
// Example: Match records such as: Blackwell Toaster USD29.95
// 1. Regex with groups:
// Step 1: Note that \p{Alnum} is a predefined character class, equivalent to [A-Za-z0-9]
(\p{Alnum}+(\s+\p{Alnum}+)*)\s+([A-Z]{3})([0-9.]*)
// Step 2: Use the group method to get at each group:
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
    item = matcher.group(1); // Blackwell Toaster
    currency = matcher.group(3); // USD
    price = matcher.group(4); // 29.95
}
// 2. Clearer with named groups:
(?<item>\p{Alnum}+(\s+\p{Alnum}+)*)\s+(?<currency>[A-Z]{3})(?<price>[0-9.]*)
// Then you retrieve items by name:
item = matcher.group("item");
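Here is the named-group regex as a runnable sketch, with every backslash doubled for the Java string literal. The class name NamedGroupDemo and the helper parse are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    private static final Pattern RECORD = Pattern.compile(
        "(?<item>\\p{Alnum}+(\\s+\\p{Alnum}+)*)\\s+(?<currency>[A-Z]{3})(?<price>[0-9.]*)");

    /** Returns {item, currency, price}, or null if the whole line does not match. */
    public static String[] parse(String line) {
        Matcher matcher = RECORD.matcher(line);
        if (!matcher.matches()) return null;
        return new String[] {
            matcher.group("item"), matcher.group("currency"), matcher.group("price")
        };
    }

    public static void main(String[] args) {
        String[] parts = parse("Blackwell Toaster USD29.95");
        System.out.println(String.join(" | ", parts)); // Blackwell Toaster | USD | 29.95
    }
}
```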
2.7.4 Splitting by Delimiters
// Specify the delimiter as a regex:
Pattern commas = Pattern.compile("\\s*,\\s*");
String[] tokens = commas.split(input); // String "1, 2, 3" turns into array ["1", "2", "3"]

// Fetch results lazily for large inputs:
Stream<String> tokenStream = commas.splitAsStream(input);

// If you don't care about efficiency, just use the String.split method:
String[] tokens2 = input.split("\\s*,\\s*");
2.7.5 Replacing Matches
// To replace all matches, call replaceAll on the matcher
Matcher matcher = commas.matcher(input);
String result = matcher.replaceAll(", ");
// If you don't care about efficiency, just use the String.replaceAll method:
String result = input.replaceAll("\\s*,\\s*", ", ");
// Group numbers $n or names ${name} are replaced with the captured group:
String result = "3:45".replaceAll(
    "(\\d{1,2}):(?<minutes>\\d{2})",
    "$1 hours and ${minutes} minutes");
Regex improvements in Java 9/10:
1) Matcher.results and Scanner.findAll get a stream of match results:
Pattern pattern = Pattern.compile("[^,]");
Stream<String> matches = pattern.matcher(str).results().map(MatchResult::group);

matches = new Scanner(path).findAll(pattern).map(MatchResult::group);
2) Matcher.replaceFirst / replaceAll now have a version with a replacement function:
String result = Pattern.compile("\\pL{4,}")
    .matcher("Mary had a little lamb")
    .replaceAll(m -> m.group().toUpperCase());
// yields "MARY had a LITTLE LAMB"
2.3 Serialization
In practice, there are two ways to store data:
- Data of the same type => use a fixed-length record format (as in the sample randomAccess\Employee.java, which requires fixed-length fields)
- Objects => serialization (as in the sample objectStream\Employee.java, which must implement Serializable)
Serialization: an object -> a sequence of bytes. Deserialization: a sequence of bytes -> an object.
Useful for sending objects to a different computer and short-term storage (e.g. cache). Not intended for long-term storage.
Participating classes implement the Serializable marker interface:
public class Employee implements Serializable { ... }
// 1. Output stream
// 1.1 Construct an ObjectOutputStream object:
ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));
// 1.2 Call the writeObject method:
Employee peter = new Employee("Peter", 90000);
Employee paul = new Manager("Paul", 180000);
out.writeObject(peter);
out.writeObject(paul);
// 2. Input stream
// 2.1 Construct an ObjectInputStream object:
ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path)); // the Employee class contains strings and floating-point numbers, all of which are serializable
// 2.2 Retrieve the objects in the same order as they were saved:
Employee e1 = (Employee) in.readObject();
Employee e2 = (Employee) in.readObject();
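For writeObject to work on these objects, two conditions must hold:
- 1. The class must implement the Serializable interface;
- 2. All instance variables of the class must be serializable as well;
// Consider this network of objects, in which one object is shared by several others. The whole object network needs to be saved: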
Employee peter = new Employee("Peter", 40000);
Manager paul = new Manager("Paul", 105000);
Manager mary = new Manager("Mary", 180000);
paul.setAdmin(peter);
mary.setAdmin(peter);
ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));
out.writeObject(peter);
out.writeObject(paul);
out.writeObject(mary);
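The object serialization algorithm works as follows:
1) When saving:
- Associate a serial number with each object reference that is encountered;
- When an object is encountered for the first time, save its object data to the output stream;
- If an object has been saved before, just write "same as the previously saved object with serial number x"
2) When reading back:
- When an object's serial number is first encountered in the object input stream, construct the object, initialize it with the data from the stream, and record the association between the serial number and the new object;
- When the marker "same as the previously saved object with serial number x" is encountered, retrieve the object reference associated with that serial number;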
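The effect of this algorithm can be observed with an in-memory round trip: two managers that share one admin before saving still share a single (new) admin object afterwards. The sketch below uses simplified, hypothetical Employee and Manager classes with a public admin field instead of the setAdmin method used above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceDemo {
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Employee(String name) { this.name = name; }
    }

    static class Manager extends Employee {
        private static final long serialVersionUID = 1L;
        Employee admin;
        Manager(String name) { super(name); }
    }

    /** Writes the managers to one object stream and reads back deep copies. */
    static Manager[] roundTrip(Manager... managers) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(managers);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Manager[]) in.readObject();
        }
    }

    /** True if the copies share one admin object that is distinct from the original. */
    public static boolean adminStillShared() throws IOException, ClassNotFoundException {
        Employee peter = new Employee("Peter");
        Manager paul = new Manager("Paul");
        Manager mary = new Manager("Mary");
        paul.admin = peter;
        mary.admin = peter;
        Manager[] copies = roundTrip(paul, mary);
        return copies[0].admin == copies[1].admin && copies[0].admin != peter;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(adminStillShared()); // true
    }
}
```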
Declare fields that shouldn't be serialized with the transient modifier.
You can take over the serialization of fields by implementing the readObject / writeObject methods.
You can delegate serialization and deserialization to a proxy by implementing the readResolve/writeReplace methods. (Useful in rare cases when object identity needs to be preserved.)
You can support multiple versions of a serialized class.
- The default serialVersionUID is obtained by hashing field names and types.
- If the serialVersionUID changes, readObject throws an exception.
- You can declare your own version ID and implement deserialization to consider multiple versions: private static final long serialVersionUID = 2L; // Version 2
- Complex and rarely useful.