Where Is the /output Created in Hadoop?

1. Flowchart

flowchart TD
    A[Create Hadoop job] --> B[Run Hadoop job]
    B --> C[Check the output path]

2. Gantt Chart

gantt
    title Hadoop Job Creation Gantt Chart
    dateFormat  YYYY-MM-DD
    section Create Job
    Create Hadoop Job        :active, a1, 2022-01-01, 1d
    section Run Job
    Run Hadoop Job           :a2, 2022-01-02, 1d
    section Check Output Path
    Check output path        :a3, 2022-01-03, 1d

3. Overall Workflow

Output files created in Hadoop are normally stored in the Hadoop Distributed File System (HDFS). Concretely, the workflow is as follows:

  1. Create the Hadoop job
  2. Run the Hadoop job
  3. Check the output path
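Before starting, it can help to confirm that HDFS is up and the hadoop CLI is on your PATH. A minimal sanity check (assuming a running cluster) is to list the filesystem root:

hadoop fs -ls /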

4. Detailed Steps

4.1 Creating the Hadoop Job

Creating a Hadoop job involves writing the MapReduce program, packaging the Java code into a jar, and so on. Here is a simple example:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class WordCount {
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private final Text word = new Text();
        private final LongWritable count = new LongWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Split each input line on whitespace and emit (word, 1) per token.
            String line = value.toString();
            String[] tokens = line.split("\\s+");

            for (String token : tokens) {
                this.word.set(token);
                context.write(this.word, this.count);
            }
        }
    }

    public static class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        private final LongWritable result = new LongWritable();

        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
            // Sum the counts emitted for this word and write the total.
            long sum = 0;

            for (LongWritable value : values) {
                sum += value.get();
            }

            this.result.set(sum);
            context.write(key, this.result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Word Count");
        job.setJarByClass(WordCount.class);

        job.setMapperClass(WordCountMapper.class);
        // The reducer doubles as a combiner because summing is associative.
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // MapReduce refuses to start if the output directory already exists,
        // so delete any leftover directory from a previous run.
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(new Path(args[1]))) {
            fs.delete(new Path(args[1]), true);
        }

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The code above is a simple word count program that counts how many times each word appears in the input text.

4.2 Running the Hadoop Job

From a terminal, change into the directory containing the packaged jar and run the job with:

hadoop jar wordcount.jar WordCount <input-path> <output-path>

Here, wordcount.jar is the packaged Java program, <input-path> is the path to the input files, and <output-path> is the path where the output will be written.
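If the jar has not been built yet, one way to produce it is sketched below (assuming the source file is named WordCount.java and the hadoop CLI is available; exact paths vary by installation):

# Compile against the Hadoop client libraries, then package the classes
javac -classpath "$(hadoop classpath)" WordCount.java
jar cf wordcount.jar WordCount*.class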

4.3 Checking the Output Path

The output path is where the files produced by the Hadoop job are stored. Hadoop does not write to a fixed default such as /output; the job writes to whatever path you passed to FileOutputFormat.setOutputPath, i.e. the <output-path> argument above. You can list the output path with:

hadoop fs -ls <output-path>

where <output-path> is the output path you specified when running the job.
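Note that if you submitted the job with a relative path such as output, HDFS resolves it against your home directory, so the results typically land under /user/<username>/output. For example (a sketch; the home directory prefix depends on your cluster configuration):

hadoop fs -ls /user/$(whoami)/output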

To view the contents of an output file, use:

hadoop fs -cat <output-path>/part-r-00000

Here part-r-00000 is the file written by the first reducer. A job with several reducers produces one part-r-NNNNN file per reducer, and a successful run also leaves an empty _SUCCESS marker file in the output directory.
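To read the results from Java instead of the command line, a minimal sketch using the HDFS FileSystem API is shown below (the ReadOutput class name and the output-directory argument are illustrative assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

import java.io.InputStream;

public class ReadOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Match every reducer output file under the directory given as args[0].
        FileStatus[] parts = fs.globStatus(new Path(args[0], "part-r-*"));
        if (parts == null) {
            System.err.println("No part-r-* files found under " + args[0]);
            return;
        }

        for (FileStatus status : parts) {
            try (InputStream in = fs.open(status.getPath())) {
                // Copy the file to stdout; 'false' leaves the streams open
                // so try-with-resources can close the input itself.
                IOUtils.copyBytes(in, System.out, 4096, false);
            }
        }
    }
}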

5. Summary

This covers where the output created by a Hadoop job ends up: in the HDFS directory you passed as <output-path> when submitting the job, which you can inspect with hadoop fs -ls and read with hadoop fs -cat.