Where Is the /output Directory Created in Hadoop?
1. Flowchart
flowchart TD
A[Create the Hadoop job] --> B[Run the Hadoop job]
B --> C[Inspect the output path]
2. Gantt chart
gantt
title Gantt chart for creating a Hadoop job
dateFormat YYYY-MM-DD
section Create the job
Create the Hadoop job :active, a1, 2022-01-01, 1d
section Run the job
Run the Hadoop job :a2, 2022-01-02, 1d
section Inspect the output path
Inspect the output path :a3, 2022-01-03, 1d
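3. Overall flow
Output files created by a Hadoop job are normally stored in the Hadoop Distributed File System (HDFS), not on the local file system. The overall flow is:
- Create the Hadoop job
- Run the Hadoop job
- Inspect the output path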
4. Detailed steps
4.1 Create the Hadoop job
Creating a Hadoop job involves writing the MapReduce program and packaging the Java code into a JAR. A simple example:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class WordCount {

    // Emits (word, 1) for every whitespace-separated token of an input line.
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private final Text word = new Text();
        private final LongWritable count = new LongWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] tokens = line.split("\\s+");
            for (String token : tokens) {
                if (token.isEmpty()) {
                    continue; // split() yields an empty token for blank lines and leading whitespace
                }
                word.set(token);
                context.write(word, count);
            }
        }
    }

    // Sums the counts of each word; also used as the combiner, since addition is associative.
    public static class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        private final LongWritable result = new LongWritable();

        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable value : values) {
                sum += value.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Word Count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // args[0] is the input path, args[1] is the output path.
        Path outputPath = new Path(args[1]);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, outputPath);

        // The job fails if the output directory already exists, so remove it first.
        // Note: this recursively deletes whatever is stored at that path.
        FileSystem fs = FileSystem.get(conf); // the default file system (fs.defaultFS)
        if (fs.exists(outputPath)) {
            fs.delete(outputPath, true);
        }

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
The code above is a simple Word Count program that counts how many times each word occurs in the input text.
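To check what counts the job should produce before running it on a cluster, the same logic can be sketched in plain Java. This is only an illustration and does not use Hadoop: the class name WordCountSketch and the method countWords are made up for this example, and it skips the empty tokens that split can produce for blank lines or leading whitespace.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the Word Count logic; it does not use Hadoop.
class WordCountSketch {

    // Mirrors the map and reduce phases: split on whitespace, then sum per word.
    static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>(); // keeps the words sorted
        for (String line : lines) {
            for (String token : line.split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // blank line or leading whitespace
                }
                counts.merge(token, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("hello hadoop", "hello hdfs");
        // Print one "word<TAB>count" pair per line.
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

For the two sample lines, hello is counted twice and the other two words once each.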
4.2 Run the Hadoop job
In a terminal, change to the directory that contains the job's JAR and run the job with the following command:
hadoop jar wordcount.jar WordCount <input-path> <output-path>
Here, wordcount.jar is the packaged Java program, <input-path> is the path of the input files, and <output-path> is the path where the output files are written.
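The two paths on the command line reach the program as args[0] and args[1]. A minimal sketch of checking them before the job is configured follows; the class name JobArgsSketch and the method parse are hypothetical helpers, not part of Hadoop.

```java
// Illustration only: validates the two command-line paths before a job is set up.
class JobArgsSketch {

    // Returns {inputPath, outputPath}, or throws if the command line is incomplete.
    static String[] parse(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "Usage: hadoop jar wordcount.jar WordCount <input-path> <output-path>");
        }
        return new String[] {args[0], args[1]};
    }

    public static void main(String[] args) {
        String[] paths = parse(new String[] {"/input", "/output"}); // sample values
        System.out.println("input=" + paths[0] + ", output=" + paths[1]);
    }
}
```

One thing to watch for: if the JAR's manifest already declares a main class, hadoop jar does not consume the WordCount argument, so it would arrive as args[0] and shift the paths by one.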
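4.3 Inspect the output path
The output path is where a Hadoop job stores the files it produces. Hadoop has no built-in default output path: the job writes to whatever path is passed to FileOutputFormat.setOutputPath, which here is the <output-path> argument. /output is simply a common choice in tutorials. As an absolute path it names a directory at the root of HDFS, not a directory on the local disk. A relative path such as output is resolved against the user's HDFS home directory, typically /user/<username>. List the contents of the output directory with: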
hadoop fs -ls <output-path>
Here, <output-path> is the output path you specified when running the Hadoop job.
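Where that path ends up depends on how it was written. The sketch below mimics the resolution in plain Java, assuming HDFS is the default file system and the home directory prefix is the usual /user. The class OutputPathSketch, the method resolve, the host namenode:8020 and the user alice are all made up for illustration; this is not Hadoop's own code.

```java
// Illustration only: mimics how an output argument maps to an HDFS location.
class OutputPathSketch {

    // defaultFs stands in for fs.defaultFS; user is the submitting user (hypothetical values).
    static String resolve(String defaultFs, String user, String path) {
        if (path.contains("://")) {
            return path; // already fully qualified
        }
        if (path.startsWith("/")) {
            return defaultFs + path; // absolute: rooted at the top of the file system
        }
        return defaultFs + "/user/" + user + "/" + path; // relative: under the home directory
    }

    public static void main(String[] args) {
        String fs = "hdfs://namenode:8020"; // hypothetical fs.defaultFS
        System.out.println(resolve(fs, "alice", "/output")); // hdfs://namenode:8020/output
        System.out.println(resolve(fs, "alice", "output"));  // hdfs://namenode:8020/user/alice/output
    }
}
```

So if hadoop fs -ls /output shows nothing, it is worth checking whether the job was given output without the leading slash and listing the home directory instead.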
To view the contents of an output file, use the following command:
hadoop fs -cat <output-path>/part-r-00000
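Here, part-r-00000 is the output file written by the first reducer. A job with several reducers writes one file per reducer (part-r-00001, part-r-00002, and so on), and a successful job also leaves an empty _SUCCESS marker file in the output directory.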
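Since the Word Count job does not set an output format, it should use the default TextOutputFormat, which writes one key and value per line separated by a tab. A minimal sketch of the part-file naming and of parsing such lines is shown below; the class PartFileSketch and its methods are hypothetical helpers, and the sample lines are invented rather than real job output.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustration only: reducer output file names and the "key<TAB>value" line layout.
class PartFileSketch {

    // Name of the file written by reducer number `partition`, e.g. part-r-00000.
    static String partFileName(int partition) {
        return String.format(Locale.ROOT, "part-r-%05d", partition);
    }

    // Parses "word<TAB>count" lines into a map, keeping the order of the lines.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // not a key/value line
            }
            counts.put(line.substring(0, tab), Long.parseLong(line.substring(tab + 1).trim()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(partFileName(0));
        System.out.println(parse(List.of("hadoop\t1", "hello\t2"))); // invented sample lines
    }
}
```

The lines passed to parse could come from the output of hadoop fs -cat, for example after redirecting it to a local file.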
5. Summary
That covers where a Hadoop job writes its output and how to inspect the output path.